ABSTRACT

We model the abundance of haloes in the $\sim (3\,\mathrm{Gpc}/h)^3$ volume of the MICE Grand
Challenge simulation by fitting the universal mass function with an improved
Jack-Knife error covariance estimator that matches theory predictions. We present
unifying relations between different fitting models and new predictions for linear
($b_1$) and non-linear ($c_2$ and $c_3$) halo clustering bias. Different mass function
fits show strong variations in their performance when including the low mass range
($M_h < 3 \times 10^{12}\,M_\odot/h$) in the analysis. Together with fits from the
literature we find an overall variation in the amplitudes of around 10% in the low
mass and up to 50% in the high mass (galaxy cluster) range
($M_h > 10^{14}\,M_\odot/h$). These variations propagate into a 10% change in
$b_1$ predictions and a 50% change in $c_2$ or $c_3$. Despite these strong variations
we find universal relations between $b_1$ and $c_2$ or $c_3$ for which we provide
simple fits. Excluding low mass haloes, different models fitted with reasonable
goodness in this analysis show percent level agreement in their $b_1$ predictions,
but are systematically 5-10% lower than the bias directly measured with two-point
halo-mass clustering. This result confirms previous findings derived from smaller
volumes (and smaller masses). Inaccuracies in the bias predictions lead to 5-10%
errors in growth measurements. They also affect any HOD fitting or (cluster) mass
calibration from clustering measurements.

Key words: galaxies: haloes - galaxies: abundances - methods: analytical -
methods: statistical - dark matter - large-scale structure of Universe


1 INTRODUCTION 

Observations of structures in the large-scale distribution of
galaxies are a powerful tool for constraining cosmological
models. However, such constraints require a model which
connects the galaxy distribution to the matter density field.
Various observations support the gravitational instability
paradigm in which galaxies form in potential wells, generated
by the gravitational collapse of dark matter into haloes.
The relation between the halo and full matter density fields
($\rho_h$ and $\rho_m$ respectively) is therefore a crucial
ingredient for a precise large-scale structure analysis. In fact,
the uncertainties in this relation strongly increase the errors
in the Dark Energy equation of state or gravitational growth
index from future galaxy surveys (e.g. Eriksen and Gaztañaga
2015).

A formal approach for describing the halo-matter density
relation was suggested by Fry and Gaztañaga (1993) as the
Taylor expansion around the matter density contrast,
$\delta_m$, at the same position, known as bias function

$$\delta_h(\mathbf{r}) = F[\delta_m(\mathbf{r})] \simeq \sum_{i=0}^{N} \frac{b_i}{i!}\,\delta_m^i(\mathbf{r}), \qquad (1)$$

where $\delta(\mathbf{r}) = (\rho(\mathbf{r}) - \bar\rho)/\bar\rho$,
$\bar\rho$ is the mean density and $\mathbf{r}$ denotes the
spatial position. For the construction of $\delta(\mathbf{r})$ the
density field is commonly smoothed with a top-hat filter of
size $R$. The coefficients $b_i$ are the so-called bias parameters,
while we will investigate non-linear bias in terms of the
ratios $c_2 = b_2/b_1$ and $c_3 = b_3/b_1$. The relation in equation
(1) corresponds to a local bias model in which the density
of galaxies is fully determined by the matter density at
the same position, while environmental effects are not considered.
Recent studies demonstrated that the local model
is inadequate as tidal forces of the surrounding large-scale
structure generate non-local contributions to the bias function
(e.g. Chan et al. 2012; Baldauf et al. 2012). However,
the two-point correlation, commonly used to study galaxy
clustering, is primarily sensitive to the linear bias parameter
$b_1$ at scales between 20 and 60 $h^{-1}$ Mpc. Due to the level of
precision achieved in our analysis we will not take non-linear
and non-local bias contributions to the two-point correlation
into account. Note that, especially at smaller scales, neglecting
these contributions would not be appropriate (e.g. Saito et al.
2014; Biagetti et al. 2014).
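
As a minimal sketch of the local expansion in equation (1), the following Python function sums the series of terms $b_i\,\delta_m^i/i!$ for a given list of coefficients. The function name `local_bias` and the coefficient values are hypothetical and chosen only for illustration; they are not fits from this work.

```python
from math import factorial


def local_bias(delta_m, b):
    """Evaluate delta_h = sum_i b[i] / i! * delta_m**i, truncated at order len(b) - 1."""
    return sum(b_i / factorial(i) * delta_m**i for i, b_i in enumerate(b))


# Hypothetical coefficients b_0, b_1, b_2, b_3 (illustrative values only).
b_example = [0.0, 1.5, -0.6, 0.9]

# Non-linear bias ratios as defined in the text.
c2 = b_example[2] / b_example[1]  # c_2 = b_2 / b_1
c3 = b_example[3] / b_example[1]  # c_3 = b_3 / b_1

# Halo density contrast for a smoothed matter density contrast of 0.1.
delta_h = local_bias(0.1, b_example)
```

For small $\delta_m$ the linear term dominates, which is why the two-point correlation at large scales is mainly sensitive to $b_1$.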

Besides the clustering, the abundance of haloes as a
function of halo mass (known as the mass function) is also
related to the bias function. This relation
can be understood with the peak-background split model
(hereafter referred to as PBS model, Bardeen et al. 1986;
Cole and Kaiser 1989; Mo and White 1996). In this model,
large-scale density fluctuations are superposed with fluctuations
at small scales. These large-scale density fluctuations
modulate the background cosmology (i.e. the mean density
and the Hubble rate) around small-scale fluctuations (e.g.
Martino and Sheth 2009). The critical density contrast for
gravitational collapse therefore depends on the environment.
In regions with large-scale overdensities more small-scale
fluctuations collapse to haloes than in underdense regions.
This effect modifies the abundance of haloes and also their
spatial distribution as they follow the pattern of the peaks of
large-scale fluctuations. Haloes are therefore biased tracers
of large-scale fluctuations in the full matter density field. For
a given matter power spectrum the halo bias parameters can
be predicted from the mass function via the PBS model.

The PBS bias predictions can be used to determine
the dark matter clustering from observed galaxy distributions
if the halo masses of a given tracer sample are
known (or the other way round). Such an analysis requires
that the bias parameters predicted from the mass function
are equivalent to the bias which affects the clustering.
Studies of this equivalence have revealed that the
PBS predictions for the linear bias $b_1$ are around 10%
below measurements from two-point clustering statistics.
Such deviations might result from assumptions of the PBS
model, such as spherical collapse or a local bias relation
(e.g. Mo et al. 1997; Desjacques et al. 2010; Paranjape et al.
2013; Schmidt et al. 2013). Further numerical effects, like
the definition of haloes in N-body simulations, or systematic
effects such as the parametrisation and fitting procedure
of the mass function might contribute to the discrepancy
between the bias from PBS and clustering (e.g.
Hu and Kravtsov 2003; Manera et al. 2010). Predictions of
the PBS for the relation between halo mass and bias are also
employed in Halo Occupation Distribution models to predict
the bias as a function of galaxy properties, such as luminosity
or color (e.g. Cooray and Sheth 2002; More et al. 2011, 2015;
Coupon et al. 2012; Carretero et al. 2015). Inaccuracies of
the PBS can affect such halo model predictions for galaxy
bias or the average number of galaxies per halo. Moreover,
haloes of equal mass could have different galaxy occupation,
depending on their environment (e.g. Pujol and Gaztañaga
2014). Besides clustering analysis the PBS can be employed
for estimating the lower mass threshold (or mass-observable
relation) of observed galaxy samples. This so-called
self-calibration method (Lima and Hu 2004, 2005) uses the fact
that both the clustering and the abundance of haloes depend
on halo mass. Inaccuracies of the PBS model can
change the estimation of halo mass thresholds and therefore
change the cosmological parameters inferred from such an
analysis (e.g. Wu et al. 2010; Manera and Gaztañaga 2011).

The broad application of the PBS model in large-scale
structure analysis and the precision of abundance and clustering
measurements from incoming observational data call
for a detailed validation of the PBS bias predictions. The
purpose of this analysis is to pursue the study of deviations
between halo bias measurements from clustering and PBS
predictions using the wide mass range of the MICE Grand
Challenge (hereafter referred to as MICE-GC) simulation
(Fosalba et al. 2015a; Crocce et al. 2013; Carretero et al.
2015; Hoffmann et al. 2015). We thereby focus on the effect
of mass function parametrisation and fitting on PBS bias
predictions. The mass function fits are affected by the error
estimations. Our analysis therefore includes a detailed
study of the mass function error and covariance which leads
us to an improvement of the standard Jack-Knife estimator.
The study of PBS bias predictions includes non-linear bias
parameters which are important for an analysis of higher-order
correlations of the large-scale halo distribution and
two-point correlations at small scales. We further compare
the mass function fits and bias predictions with results from
the literature based on different simulations, to verify a
universal behaviour of these quantities.

This paper is organised as follows. In Section 2 we
present the MICE-GC simulation and mass function fits.
In Section 3 we present new galaxy bias predictions which
we compare with the literature and find a universal relation
between bias parameters. In Section 4 we compare these
predictions with the bias directly measured from the two-point
halo-matter cross-correlations of the MICE-GC simulation.
The comparison with higher-order clustering will be presented
in a separate paper (Bel, Hoffmann & Gaztañaga, in
preparation). A summary is given together with our conclusions
in Section 5. In Appendix B we present a new method to
improve the Jack-Knife covariance matrix estimation. This
method can also be easily generalised to other statistics,
such as the two-point correlation function (Hoffmann et al.,
in preparation).


2 SIMULATION & HALO MASS FUNCTION 

Our analysis is based on dark matter haloes, identified in
the comoving outputs of the MICE-GC simulation at the
redshifts $z = 0.0$ and $0.5$. Starting from small initial density
fluctuations at redshift $z = 100$ the formation of large-scale
cosmic structure was computed with $4096^3$ gravitationally
interacting collisionless particles in a 3072 $h^{-1}$ Mpc box
using the GADGET-2 code (Springel 2005) with a softening
length of 50 $h^{-1}$ kpc. The initial conditions were generated
using the Zel'dovich approximation and a CAMB power
spectrum with the power law index of $n_s = 0.95$, which was
normalised to be $\sigma_8 = 0.8$ at $z = 0.0$. The cosmic expansion
is described by the $\Lambda$CDM model for a flat universe
with a mass density of $\Omega_m = \Omega_{dm} + \Omega_b = 0.25$.
The density of the baryonic mass is set to $\Omega_b = 0.044$ and
$\Omega_{dm}$ is the dark matter density. The dimensionless Hubble
parameter is set to $h = 0.7$. More details and validation tests on
this simulation can be found in Fosalba et al. (2015a).

Dark matter haloes were identified as Friends-of-Friends
groups (hereafter referred to as FoF groups, Davis et al.
1985) with a redshift independent linking length of 0.2 in
units of the mean particle separation. These halo catalogues
and the corresponding validation checks are presented in
Crocce et al. (2013). To study the galaxy bias as a function
of halo mass we divide the haloes into the four redshift
independent mass samples M0, M1, M2 and M3, shown in
Table 1. These samples span a mass range from Milky Way
like haloes up to massive galaxy clusters. In our analysis we
consider mass function fits over different mass ranges which
we label as M0123, M123, M23 or M012, following the notation
in Table 3.

sample   mass range [$10^{12}\,h^{-1} M_\odot$]   $N_p$         $N_{halo}$
M0       0.58 - 2.32                              20 - 80       122300728
M1       2.32 - 9.26                              80 - 316      31765907
M2       9.26 - 100                               316 - 3416    8505326
M3       > 100                                    ≥ 3416        280837

Table 1. Halo mass samples. $N_p$ is the number of dark matter
particles per halo, $N_{halo}$ is the number of haloes per sample in the
comoving output at redshift z = 0.5.
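
The particle numbers and mass limits in Table 1 can be cross-checked with a short calculation: the particle mass follows from the mean matter density times the box volume, divided by the number of particles. The sketch below assumes the standard value of the critical density, $\rho_{crit} \simeq 2.775 \times 10^{11}\,h^2 M_\odot\,\mathrm{Mpc}^{-3}$, which is not quoted in the text; the variable names are illustrative.

```python
# Critical density today in h^2 M_sun / Mpc^3 (standard value, assumed here).
RHO_CRIT = 2.775e11

OMEGA_M = 0.25      # matter density parameter of MICE-GC
BOX_SIZE = 3072.0   # box side length in Mpc/h
N_SIDE = 4096       # particles per dimension


def particle_mass():
    """Mass of one simulation particle in M_sun/h (mean matter mass per particle)."""
    return OMEGA_M * RHO_CRIT * (BOX_SIZE / N_SIDE) ** 3


m_p = particle_mass()

# Particle-number limits of the samples M0-M3 converted to masses in 10^12 M_sun/h.
limits = {n_p: n_p * m_p / 1e12 for n_p in (20, 80, 316, 3416)}
```

Under these assumptions the converted limits agree with the mass limits listed in Table 1 at the percent level.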


2.1 Mass function definition and measurement 


The unconditional mass function, $dn(m)$, is defined as the
comoving number density of haloes with masses between $m$
and $m + dm$. The mass function can be written in a form
which is nearly independent of redshift, cosmology and initial
power spectrum (Press and Schechter 1974; Bond et al.
1991; Sheth and Tormen 1999),

$$\nu f(\nu) = \frac{m}{\bar\rho}\,\frac{dn(m)}{d\ln\nu}, \qquad (2)$$

where $\bar\rho$ is the mean comoving mass density. The height of
density peaks is defined as

$$\nu = \delta_c^2/\sigma_m^2(m), \qquad (3)$$

where $\delta_c = 1.686$ is the critical density for spherical collapse
(which is the exact solution value for the spherical
collapse in an Einstein-de Sitter universe). The variance of
matter density fluctuations, $\sigma_m^2(m)$, smoothed with a
spherical top-hat window with radius
$R(m) = (3m/4\pi\bar\rho)^{1/3}$, can be calculated as

$$\sigma_m^2(m) = \int \frac{dk}{k}\,\frac{k^3 P(k)}{2\pi^2}\,W^2(kR(m)), \qquad (4)$$

where $W(x) = (3/x^3)(\sin x - x\cos x)$ is the spherical top-hat
window in Fourier space and $P(k)$ is the linear power spectrum.
Note that $m$ refers to the matter density field when it
appears as lower index and to the mass, enclosed by $R(m)$,
when used as a variable. We measure the mass function in
the MICE-GC simulation at redshift $z$ and convert it to
$\nu f(\nu)$ to predict the halo bias parameters $b_1$, $c_2$ and $c_3$ via
the PBS theory (Bardeen et al. 1986; Cole and Kaiser 1989;
Mo and White 1996). We do not apply the halo mass correction
suggested by Warren et al. (2006) for low mass resolution,
since we analyse haloes down to 20 particles, while this
correction was only proposed for larger numbers of particles
per halo. Furthermore, it is not clear that the FoF mass
corrected in such a way is closer to the halo mass on which
the PBS model is based. But note that our results do
not depend on this correction, as illustrated in Fig. 4. More
details about how we measure the mass function are given
in the Appendix.
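
The chain from halo mass to peak height in equations (3) and (4) can be sketched numerically as follows. This is a minimal illustration, not the pipeline used for the measurements: the power spectrum `toy_power` is a hypothetical smooth function standing in for the CAMB spectrum, the mean density assumes $\rho_{crit} \simeq 2.775 \times 10^{11}\,h^2 M_\odot\,\mathrm{Mpc}^{-3}$, and the integration limits and grid size are arbitrary choices.

```python
import numpy as np

DELTA_C = 1.686             # critical density contrast for spherical collapse
RHO_MEAN = 0.25 * 2.775e11  # Omega_m * rho_crit in h^2 M_sun / Mpc^3 (assumed value)


def tophat_window(x):
    """W(x) = 3 (sin x - x cos x) / x^3, using the series 1 - x^2/10 for small x."""
    x = np.asarray(x, dtype=float)
    safe = np.where(x < 1e-3, 1.0, x)  # dummy argument where the series is used
    exact = 3.0 * (np.sin(safe) - safe * np.cos(safe)) / safe**3
    return np.where(x < 1e-3, 1.0 - x**2 / 10.0, exact)


def radius(m):
    """Top-hat radius in Mpc/h enclosing mass m (M_sun/h) at the mean density."""
    return (3.0 * m / (4.0 * np.pi * RHO_MEAN)) ** (1.0 / 3.0)


def toy_power(k):
    """Hypothetical smooth spectrum; NOT the CAMB spectrum used for MICE-GC."""
    return 2.0e4 * k / (1.0 + (k / 0.02) ** 2) ** 1.5


def sigma2(m, power=toy_power, k_min=1e-4, k_max=1e2, n_k=4000):
    """Equation (4): variance of the density field smoothed on the scale R(m)."""
    ln_k = np.linspace(np.log(k_min), np.log(k_max), n_k)
    k = np.exp(ln_k)
    integrand = k**3 * power(k) / (2.0 * np.pi**2) * tophat_window(k * radius(m)) ** 2
    # Trapezoidal rule in ln k, since dk / k = d ln k.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(ln_k)))


def peak_height(m, power=toy_power):
    """Equation (3): nu = delta_c^2 / sigma_m^2(m)."""
    return DELTA_C**2 / sigma2(m, power)
```

Because the smoothed variance decreases with increasing smoothing radius, the peak height grows with halo mass, so the most massive haloes populate the high-$\nu$ end of the mass function.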

Our measurements of $\nu f(\nu)$ at $z = 0.0$ and $z = 0.5$ are
shown as symbols in Fig. 1. As expected, they agree visually
with the idea of a weak redshift dependence for FoF
mass functions when using a redshift independent linking
length (e.g. Press and Schechter 1974; Jenkins et al. 2001;
More et al. 2011). Errors and covariances of the measurements
were derived with a new estimator which combines
the JK approach with predictions for sampling variance from
the power spectrum (see Appendix B). We also show in Fig.
1 fits to the measurements, based on the mass function
parametrisation of Tinker et al. (2010), equation (5). The model,
fitted over the mass range M123 (that is, excluding the low
mass sample M0, see Table 3), is in reasonable agreement
with the measurements. Including the sample M0 (haloes
with fewer than 80 particles) in the fitting range leads to poor
fits of the model. The fits at both redshifts differ by less
than 5% for $\ln(\nu) < 3$, confirming the low redshift dependence
seen in the measurements. The redshift dependence is stronger
when lower masses are included in the fitting range, possibly
because of redshift dependent noise in the low mass FoF
detection. At larger masses ($\ln(\nu) > 3$) we find up to 10%
deviations, which are comparable with the mass function errors.
We have verified that our conclusions also hold for fits
over the higher mass range, M23, and different mass function
binnings. A detailed analysis of the mass function fits,
including fits of other mass function models over different
mass ranges and different binnings as well as a comparison
with fits compiled from the literature, can be found in the
next section.


2.2 Mass function fits 


In order to predict the halo bias from the mass function
via the PBS approach we fit different mass function models
to the measurements. Several systematic effects, such
as the choice of the mass function model or the mass
range over which the model is fitted, can limit the accuracy
of the PBS bias predictions (e.g. Manera et al. 2010;
Manera and Gaztañaga 2011). The objective of the subsequent
analysis is to find out how strongly these effects
impact the predicted linear, quadratic and third-order bias
coefficients. In particular we aim to verify if the disagreement
between PBS predictions for the linear bias and the
corresponding measurements from two-point correlations,
presented in Section 4, is driven by possible shortcomings
of the mass function fits. We therefore study in this subsection
the fitting performances of different mass function
models.

[Figure 1; horizontal axis: $\ln(\nu)$]

Figure 1. Top: unconditional halo mass function, defined in
equation (2), as a function of the peak height
$\nu = \delta_c^2/\sigma_m^2(m)$. Symbols show MICE-GC measurements
with $1\sigma$ errors based on FoF groups at the redshifts z = 0.0
and z = 0.5 (blue circles and red triangles respectively). Lines
show the mass function model of Tinker et al. (2010), fitted to
the measurements over the mass range M23 in the same colour
coding as the symbols. Center: significance of the deviation
between measurements and fits. Bottom: relative deviation
between the fits at z = 0.0 and z = 0.5 (black solid line). The
$1\sigma$ errors of the measurements are shown as lines in the same
colour coding as in the top panels. Vertical blue dashed and red
dash-dotted lines denote the limits of the halo mass samples
M0-M3 at z = 0.0 and z = 0.5 respectively, given in Table 1.

mass range   halo masses [$10^{12}\,h^{-1} M_\odot$]   $N_p$
M0123        > 0.58                                    > 20
M123         ≥ 2.32                                    ≥ 80
M23          ≥ 9.26                                    ≥ 316
M012         0.58 - 100                                20 - 3416

Table 3. Halo mass ranges for mass function fits and clustering
analysis. $N_p$ is the number of particles per halo.

The latest model in our analysis, and the one with the highest
number of free parameters, is the expression given by
Tinker et al. (2010) (hereafter referred to as Tinker model).
It can be written as

$$\nu f(\nu) = A\,[1 + (b\nu)^a]\,\nu^d\,e^{-c\nu/2}, \qquad (5)$$

where $A$, $a$, $b$, $c$, $d$ are the free parameters. We have redefined
the parameters so that fixing certain parameters delivers
expressions which correspond to the mass function models
suggested by Press and Schechter (1974), Sheth and Tormen
(1999) and Warren et al. (2006) (hereafter referred to as
PS, ST and Warren model respectively). The corresponding
parameter constraints are summarised in Table 2 together
with the abbreviations for the reference of each model.
This unification of notation allows a more direct comparison
between models. In Table 2 we also propose a constraint
which constitutes a new mass function fit. Its advantage is
that it has as many free parameters as the Warren model,
but matches the mass function better when we fit over the
whole mass range, as we show later.

model      reference                    constraints
Tinker     Tinker et al. (2010)         A, a, b, c, d free
Warren     Warren et al. (2006)         d = 0
ST         Sheth and Tormen (1999)      c = b, d = 1/2
PS         Press and Schechter (1974)   a = 0, c = 1, d = 1/2
proposal   this work                    c = 1

Table 2. Constraints on the parameters in equation (5) corresponding
to different mass function models. We refer to the models in the
text using the abbreviations given in the left column.

In our analysis we will focus on the models of ST, Warren,
Tinker and our proposal. We determine the best fitting
parameters for each mass function model by minimising

$$\chi^2 = \sum_{i,j}^{N_{bin}} \Delta_i\,C_{ij}^{-1}\,\Delta_j, \qquad (6)$$

with $\Delta_i = (X_i^{fit} - X_i)/\sigma_{X_i}$ and $X = \nu f(\nu)$.
$C_{ij}$ and $\sigma_{X_i}$ are derived from our new JK estimator,
introduced in Appendix B. For searching the best fitting
parameters we implemented a Monte Carlo Markov Chain algorithm
to explore the parameter space.
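
The way the models nest inside the Tinker form of equation (5) can be sketched in a few lines of Python. The sketch assumes the form $\nu f(\nu) = A[1 + (b\nu)^a]\nu^d e^{-c\nu/2}$ and the constraints listed in Table 2; the function names `nu_f` and `apply_constraints` are hypothetical, and the example parameters are the Tinker values of the M23 fit at z = 0.0 from Table 4.

```python
import numpy as np


def nu_f(nu, A, a, b, c, d):
    """Unified mass function: nu f(nu) = A [1 + (b nu)^a] nu^d exp(-c nu / 2)."""
    nu = np.asarray(nu, dtype=float)
    return A * (1.0 + (b * nu) ** a) * nu**d * np.exp(-0.5 * c * nu)


def apply_constraints(model, A, a, b, c=None, d=None):
    """Return the full tuple (A, a, b, c, d) after imposing the constraints of one model."""
    if model == "Tinker":    # all five parameters free
        return A, a, b, c, d
    if model == "Warren":    # d = 0
        return A, a, b, c, 0.0
    if model == "ST":        # c = b, d = 1/2
        return A, a, b, b, 0.5
    if model == "proposal":  # c = 1
        return A, a, b, 1.0, d
    raise ValueError(f"unknown model: {model}")


# Tinker parameters of the M23 fit at z = 0.0 (Table 4).
tinker_m23 = apply_constraints("Tinker", 0.17, 1.10, 0.55, c=0.85, d=0.01)

# Mass function amplitude on a grid in ln(nu) between 0 and 3.
nu_grid = np.exp(np.linspace(0.0, 3.0, 7))
amplitudes = nu_f(nu_grid, *tinker_m23)
```

Writing all models through one function makes the number of free parameters explicit: the Warren model and the proposed model both leave four parameters free, while the ST model leaves three.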

In Fig. 2 we show the significance of the deviations
between the mass function measurements and the best fits
by the different models. Results are shown at the redshifts
$z = 0.0$ and $z = 0.5$, while at each redshift we fit the mass
function over the different mass ranges, which are shown in
Table 3. The first includes all halo mass samples (M0123),
the second and the third exclude the low mass samples
(M123, M23) and the fourth mass range excludes the highest
mass sample (M012). For each fitting range we show fits
based on seven different mass function binnings, dividing
the mass range into 20, 25, 30, ..., 50 logarithmic bins. We
find that the deviations between fit and measurement can
vary with the binning. However, we also see trends which
are independent of this systematic effect.
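
The seven logarithmic binnings can be generated as in the following sketch. The helper name `log_mass_bins` is hypothetical, and the upper mass limit is an assumption made only for illustration; the lower limit corresponds to the 20-particle threshold of sample M0.

```python
import numpy as np


def log_mass_bins(m_min, m_max, n_bins):
    """Return n_bins + 1 edges splitting [m_min, m_max] into bins of equal width in log(m)."""
    return np.logspace(np.log10(m_min), np.log10(m_max), n_bins + 1)


# The seven binnings quoted in the text: 20, 25, ..., 50 bins.
N_BINS = list(range(20, 55, 5))

# Mass limits in M_sun/h; the upper limit of 1e15 is an illustrative assumption.
binnings = {n: log_mass_bins(5.8e11, 1e15, n) for n in N_BINS}
```

Repeating a fit for each entry of `binnings` and comparing the results is one way to separate binning-dependent scatter from trends that persist across all binnings.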

All mass function models show a clear dependence of
the best fit on the chosen mass range, while this dependence
is weaker for the ST model. The Tinker parametrisation is
the model which best fits the measurements at both redshifts
and all mass ranges. This can be attributed to the fact that
it contains the highest number of free parameters. The best
fit parameters for the Tinker model are given in Table 4. For
fits over the whole mass range (M0123) our proposed mass
function model seems to match the measurements almost as
well as the Tinker model, while having one free parameter
less. It also has the advantage of producing stable values
for the parameters regardless of the range used for the fit.
When the fits are performed only at the highest mass range
(M23) the Tinker and the Warren mass functions fit the data
equally well, while the proposed model is a slightly worse fit.
The ST model delivers the poorest fits in all cases. At $z = 0.5$
we find strong deviations between fits and measurements
when the fitting range includes the low mass sample M0.
This indicates that the FoF detection of low mass haloes
can be strongly affected by shot-noise, while this effect is
stronger at higher redshift.

[Figure 2; left panels: z = 0.0, right panels: z = 0.5; horizontal axis: $\ln(\nu)$]

Figure 2. Significance of the deviations between mass function fits and measurements versus the peak height $\nu = \delta_c^2/\sigma_m^2(m)$. Panels from
top to bottom show results for fits over the mass ranges M0123, M123, M23 and M012 respectively. These ranges are marked by thick
grey horizontal lines. Grey vertical lines denote the minimum and maximum peak heights of the different halo mass samples M0-M3.
Dash-dotted horizontal lines denote $1\sigma$ deviations between fits and measurements. Results for the redshifts z = 0.0 (z = 0.5) are shown
in the left (right) panels. Coloured lines show fits to the models of Tinker (solid blue), Warren (dashed-dotted green) and ST (dashed
orange), while fits to our proposed model are shown as red dashed-double-dotted lines. For each model we show seven fits, which were
derived from mass function measurements based on dividing the whole mass range into 20, 25, 30, ..., 50 bins.

z     mass range   A      a      b      c      d      $\chi^2_{min}$/d.o.f.
0.0   M0123        0.28   1.80   0.22   1.08   0.47   25.6
0.5   M0123        0.31   2.74   0.20   1.37   0.87   125.6
0.0   M123         0.24   1.39   0.22   0.94   0.34   3.4
0.5   M123         0.26   1.70   0.17   0.98   0.45   3.5
0.0   M23          0.17   1.10   0.55   0.85   0.01   1.5
0.5   M23          0.22   1.28   0.34   0.86   0.05   1.6

Table 4. Best fit parameters for the Tinker mass function model
taking covariance between different mass bins into account. We
show fits over the mass ranges M0123, M123 and M23, defined
in Table 3, also displayed in Fig. 3 for z = 0.0. We show the
mean of fits with different binnings. The corresponding standard
deviations are typically at the 2% level.

For studying the goodness of the best fits for the different
mass function models we present their best fit parameters
and the corresponding $\chi^2$ values per degree of freedom
(d.o.f.) in Fig. 3, where the d.o.f. refer to the number of mass
function bins used for the fit. Results are shown for fits over
the mass ranges M0123, M123 and M23, which correspond
to the different minimum peak heights given by the x-axis.
For clarity we show here only results at redshift $z = 0.0$,
while we find similar results at $z = 0.5$. For each fit we
show mean results with standard deviations from the seven
mass binnings mentioned previously. In addition to the results
derived by taking the covariance between different mass
function bins into account in the fitting procedure we show
results which were computed neglecting the covariance. We
find that neglecting the covariance can lead to different best
fit parameters, especially when the low mass range, where
the off-diagonal elements of the covariance have the highest
amplitudes, is included in the analysis (see Appendix B).
However, the bias predictions are only weakly affected by
neglecting the covariance (see Fig. C1). The conclusions
of this article about the comparison between bias predictions
and measurements do not depend strongly on
whether the covariance is used.
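
The effect of dropping the off-diagonal covariance elements can be illustrated with a small numerical sketch. It assumes that the minimised quantity is $\chi^2 = \sum_{ij} \Delta_i C^{-1}_{ij} \Delta_j$ with residuals $\Delta_i$ normalised by the $1\sigma$ errors, so that $C_{ij}$ acts as a normalised covariance (correlation) matrix. The function name `chi2` and all numbers are made up for illustration.

```python
import numpy as np


def chi2(x_fit, x_obs, sigma, corr=None):
    """chi^2 = sum_ij Delta_i C^-1_ij Delta_j with Delta_i = (x_fit_i - x_obs_i) / sigma_i.

    corr is the normalised covariance (correlation) matrix. Passing None drops
    all off-diagonal elements, i.e. the bins are treated as independent.
    """
    delta = (np.asarray(x_fit) - np.asarray(x_obs)) / np.asarray(sigma)
    if corr is None:
        return float(delta @ delta)
    return float(delta @ np.linalg.solve(corr, delta))


# Made-up residuals of two neighbouring bins that scatter in opposite directions.
x_obs = np.array([1.00, 1.00])
x_fit = np.array([1.10, 0.90])
sigma = np.array([0.10, 0.10])

# Hypothetical positive correlation between the two bins.
corr = np.array([[1.0, 0.5], [0.5, 1.0]])

chi2_full = chi2(x_fit, x_obs, sigma, corr)  # uses the off-diagonal elements
chi2_diag = chi2(x_fit, x_obs, sigma)        # neglects them
```

In this toy case the residuals run against the assumed positive correlation, so the full $\chi^2$ comes out larger than the diagonal-only value. Residuals of equal sign would give the opposite ordering, which is why the goodness of fit cannot be judged reliably without the covariance.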

[Figure 3; panels from left to right: Tinker model (A, a, b, c, d free); Warren model (A, a, b, c free, d = 0); ST model (A, a, b free, d = 1/2, c = b); proposed model (A, a, b, d free, c = 1); horizontal axis: $\ln(\nu_{min})$]

Figure 3. Top: best fit parameters for different mass function models as a function of the minimum peak height $\nu_{min}$ (corresponding
to the mass ranges M0123, M123 and M23, defined in Table 3) used for fits at redshift z = 0.0. Symbols show the means with standard
deviations derived from seven mass function binnings (see Fig. 2). Results from fits performed with and without taking the covariance
between different mass function bins into account are connected with solid and dashed lines respectively. In the latter case the symbols
are slightly shifted to the left for clarity. Bottom: minimum $\chi^2$/d.o.f. of the fits derived using our new error estimator with $8^3$ JK
samples. Note that errors can be smaller than the symbol size. We find similar results at redshift z = 0.5.

The $\chi^2$/d.o.f. results, shown in the bottom panel of Fig.
3, are very high when the mass functions are fitted over the
whole range (lowest minimum peak height). This poor performance,
which is even apparent for the Tinker model with
its five parameters, is probably related to the fact that our
mass function measurements are not reliable in the low mass
range. In fact the M0123 sample includes haloes with down
to 20 particles. For such low numbers of particles per halo we
expect strong systematic effects in the halo mass estimation
and therefore on the mass function (e.g. Warren et al. 2006;
More et al. 2011). Furthermore, the halo samples might be
contaminated with spuriously linked FoF groups. If the analysis
is performed using only the high mass sample M23
(highest minimum peak height), the $\chi^2$/d.o.f. values for the
best fit models drop down to values between unity and four.
If we perform the fits ignoring off-diagonal elements in the
covariance matrix we obtain substantially lower $\chi^2$/d.o.f.
values, especially when the fits are performed over the whole
mass range. This demonstrates that the covariance cannot
be neglected in the fit for the evaluation of the fitting
performance of a mass function model. This statement is even
true in the high mass range where the covariance is dominated
by shot-noise. This is important as the goodness of the
fit is the way to validate the predictions.

We also see in Fig. 3 that the $\chi^2$/d.o.f. can change for
different mass function binnings, which can already be seen
in Fig. 2. This dependence on the binning is also apparent
when the off-diagonal elements of the covariance matrix
are neglected in the fit. However, the best fit values of
each model and the corresponding bias estimations are only
weakly affected by this systematic effect.

Interestingly the best fit parameters of the Tinker model
have the same values as the ones from the Warren model
when the fit is performed on the higher mass M23 sample.
Consequently the minimum $\chi^2$/d.o.f. are the same in both
cases. This indicates that the parameter $d$, which is set to
zero in the Warren model, is not required for fitting the high
mass range, but becomes necessary when the low mass range
is included in the fit. The $\chi^2$/d.o.f. values of our proposed
model are smaller than those for the Warren model for minimum
peak heights of $\ln(\nu_{min}) < 0$ (M123). This agrees with
the visual impression, gained from Fig. 3, that our proposed
model delivers better mass function fits than the model of
Warren, unless the analysis is restricted to the highest mass
range M23. We come to the same conclusion when analysing
the mass function at $z = 0.5$.


2.3 Mass function universality 

In Fig. 1 we demonstrated that the mass function, when
expressed in terms of the peak height $\nu$, depends only weakly
on redshift. To verify if this universality also holds for
other cosmologies we compare our mass function fits to the
Tinker and Warren model with fits to the same models,
compiled from Warren et al. (2006, Table 8), Tinker et al.
(2010, Table 4, $\Delta_{200}$), Crocce et al. (2010, Table 2) and
Watson et al. (2013, Table 2, FoF Uni.). Crocce et al. (2010)
and Watson et al. (2013) fit mass functions to the Warren
model. Note that Crocce et al. (2010) also used simulations
from the MICE simulation suite, with the same cosmology as
MICE-GC, but rely on the nested boxes approach to cover
a similar mass range, while having a higher resolution in
the low mass end than MICE-GC. Tinker et al. (2010) used
spherical overdensities to define haloes. A universal behaviour
would not only be useful for PBS bias predictions,
but also for constraining $\sigma_8$ with galaxy luminosity functions,
statistics of the initial density field and various other
applications (see e.g. White 2002).

We compare our mass function fits with those from the
literature in Fig. 4. We find that the different mass function
fits agree at the 10% level in the low mass end, but
differ by up to 60% at high masses with a significance of
about $2\sigma$ in terms of error in the measurement. Departures
from universality are expected for different cosmologies
but can also result from systematic effects, such as the
halo mass definition (e.g. Lacey and Cole 1994; Sheth et al.
2001; Jenkins et al. 2001; White 2002; Reed et al. 2007;
Lukic et al. 2007; Tinker et al. 2008; Crocce et al. 2010;
Courtin et al. 2011; More et al. 2011; Bhattacharya et al.
2011; Castorina et al. 2014). Furthermore, the fitting
procedure affects the presented comparison as well.

Figure 4. Top: mass function fits compiled from the literature
compared with MICE-GC measurements and fits from this work
over the whole mass range (M0123) and the high mass range
(M23) at z = 0.0. Grey and black symbols show measurements
computed with and without Warren correction for halo masses
respectively. All fits from this work are based on the latter. Bottom:
relative deviations between fits and measurements in the same
colour coding as the top panel.


The comparison between the Warren fit from Crocce et al. (2010) and from this work reveals the strong impact of the latter systematic effects on the fit. These two fits agree well in the high mass end when the fit is performed over the whole mass range M0123. Interestingly, we find that these fits differ more strongly from the measurements in the high mass end than fits from other simulations. Excluding lower masses from the fit (M23) leads to a better agreement between our fit and the measurement in the high mass end and therefore to a stronger difference between the results from Crocce et al. (2010) and ours. The lower amplitude of the Crocce et al. (2010) fit at low masses indicates that the low-mass MICE-GC halo samples include more spuriously linked FoF groups, which can be expected from the low resolution, as we concluded before in this section. Furthermore, a lower mass resolution leads to an overestimation of halo masses. Correcting this effect as suggested by Warren et al. (2006) and done by Crocce et al. (2010) results in a decrease of the amplitude, which is shown as

$\epsilon_1 = (c\nu - 2d)/\delta_c$

$\epsilon_2 = [c\nu(c\nu - 4d - 1) + 2d(2d - 1)]/\delta_c^2$

$\epsilon_3 = \{c\nu[(c\nu)^2 - 6(d + 1/2)c\nu + 12d^2] - 8d^3 + 12d^2 - 4d\}/\delta_c^3$

$E_1 = -2a/\{\delta_c[(b\nu)^{-a} + 1]\}$

$E_2/E_1 = (-2a + 2c\nu - 4d + 1)/\delta_c$

$E_3/E_1 = [4a^2 + 12a(d - 1/2) + 2(2d - 1)^2 + 4d(d - 1) - 6c\nu(2d + a) + 3(c\nu)^2]/\delta_c^2$

Table 5. Coefficients for computing halo bias parameters from the Tinker et al. (2010) mass function model via equations (7)-(9). $a$, $b$, $c$ and $d$ are the free parameters in the Tinker model. Bias predictions for other mass function models can be obtained by using the constraints for the fitting parameters given in Table 2.


grey symbols in Fig. 4. The fact that our Warren corrected mass function is lower than all mass function fits in the low mass range ($\ln(\nu) < 0$) indicates that the Warren correction leads to an underestimation of halo masses when it is applied to FoF groups with of order 10 particles. For intermediate masses ($0 < \ln(\nu) < 2$) our Warren corrected measurements are in better agreement with the results from Crocce et al. (2010) than those without Warren correction. A comparison between the Warren corrected MICE-GC mass function at $z = 0.0$ and the Crocce et al. (2010) fit, presented by Crocce et al. (2013), also shows a higher amplitude of the prediction compared to the measurement in the highest mass bin at $6 \times 10^{14}\,h^{-1}\,M_\odot$ and an opposite trend for lower masses.
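To give a feeling for why the correction is most severe for poorly resolved groups, the following minimal sketch applies the particle-number correction of Warren et al. (2006), $N_{\rm corr} = N(1 - N^{-0.6})$, to FoF groups of different size. The function name is ours and purely illustrative; the group sizes are arbitrary examples and not the MICE-GC sample limits.

```python
import numpy as np

def warren_corrected_particles(n_particles):
    """Particle number after the Warren et al. (2006) correction N(1 - N^-0.6)."""
    n = np.asarray(n_particles, dtype=float)
    return n * (1.0 - n ** -0.6)

# Illustrative group sizes (assumed for this example only)
for n in (10, 20, 80, 1000):
    n_corr = float(warren_corrected_particles(n))
    print(f"N = {n:5d}   N_corr = {n_corr:8.2f}   relative change = {n_corr / n - 1.0:+.3f}")
```

For groups of 10 particles the correction removes about a quarter of the mass, while for groups of 1000 particles it is below 2%, so the correction mainly shifts the low mass end of the mass function.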


3 PBS BIAS PREDICTIONS 


The bias parameters $b_N$, introduced in equation (1), can be obtained from derivatives of the halo mass function via the PBS approach (Bardeen et al. 1986; Cole and Kaiser 1989; Mo and White 1996). Following Scoccimarro et al. (2001) we derive the first-, second- and third-order bias parameters from the mass function fits as

$$b_1(\nu) = 1 + \epsilon_1 + E_1, \tag{7}$$

$$b_2(\nu) = 2(1 + a_2)(\epsilon_1 + E_1) + \epsilon_2 + E_2, \tag{8}$$

$$b_3(\nu) = 6(a_2 + a_3)(\epsilon_1 + E_1) + 3(1 + 2a_2)(\epsilon_2 + E_2) + \epsilon_3 + E_3, \tag{9}$$

where the parameters $a_2 = -17/21$ and $a_3 = 341/567$ are given by the spherical collapse model. $E_1$, $E_2$, $E_3$, $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$ are computed from the fitted parameters in the mass function models as shown in Table 5. Note that the non-linear bias parameters (equations (8) and (9)), derived from the expressions in Table 5, are here presented for the first time for the Tinker model. Applying the parameter constraints from Table 2 delivers the equivalent expressions for the PS, ST and Warren models, as well as for our proposed model.
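As an illustration of how equations (7)-(9) are evaluated, the sketch below computes $b_1$, $b_2$ and $b_3$ from a set of Tinker-type parameters $(a, b, c, d)$ using the Table 5 coefficients. Several ingredients are assumptions of this example rather than statements from the text: the value $\delta_c = 1.686$, the function name `pbs_bias`, and the PS limit $(a, b, c, d) = (0, 1, 1, 1/2)$ used in the usage lines, which is our reading of the constraints in Table 2.

```python
import numpy as np

DELTA_C = 1.686        # assumed spherical-collapse threshold
A2 = -17.0 / 21.0      # spherical collapse coefficients quoted in the text
A3 = 341.0 / 567.0

def pbs_bias(nu, a, b, c, d, delta_c=DELTA_C):
    """Return (b1, b2, b3) at peak height nu for Tinker-type parameters a, b, c, d."""
    nu = np.asarray(nu, dtype=float)
    cn = c * nu
    # Table 5 coefficients
    eps1 = (cn - 2 * d) / delta_c
    eps2 = (cn * (cn - 4 * d - 1) + 2 * d * (2 * d - 1)) / delta_c**2
    eps3 = (cn * (cn**2 - 6 * (d + 0.5) * cn + 12 * d**2)
            - 8 * d**3 + 12 * d**2 - 4 * d) / delta_c**3
    E1 = -2 * a / (delta_c * ((b * nu) ** (-a) + 1))
    E2 = E1 * (-2 * a + 2 * cn - 4 * d + 1) / delta_c
    E3 = E1 * (4 * a**2 + 12 * a * (d - 0.5) + 2 * (2 * d - 1) ** 2
               + 4 * d * (d - 1) - 6 * cn * (2 * d + a) + 3 * cn**2) / delta_c**2
    # Equations (7)-(9)
    b1 = 1 + eps1 + E1
    b2 = 2 * (1 + A2) * (eps1 + E1) + eps2 + E2
    b3 = 6 * (A2 + A3) * (eps1 + E1) + 3 * (1 + 2 * A2) * (eps2 + E2) + eps3 + E3
    return b1, b2, b3

# Usage: assumed PS limit, where E_N vanishes and b1 = 1 + (nu - 1)/delta_c
b1, b2, b3 = pbs_bias(2.0, a=0.0, b=1.0, c=1.0, d=0.5)
print(float(b1), float(b2 / b1), float(b3 / b1))
```

Evaluating the function on an array of $\nu$ values returns arrays, so the ratios $C_2 = b_2/b_1$ and $C_3 = b_3/b_1$ can be formed directly from the returned tuple.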

Predictions for $b_1$, $C_2 = b_2/b_1$ and $C_3 = b_3/b_1$, derived from the Tinker mass function fits at $z = 0.0$, are shown as a function of FoF halo mass in Fig. 5. The results are based on mass function fits over the whole mass range (M0123) and fits over the higher mass ranges (M123 and M23). The $b_1$ predictions for the different fitting ranges agree in the high mass end, where the fitting ranges overlap and the mass function fits agree as well (see Fig. 0). In the low mass end we find a clear, but relatively weak dependence of the linear bias prediction on the fitting range. In the case of $C_2$ and $C_3$ this dependence is stronger and extends to higher halo masses. This indicates that the second- and third-order derivatives of the mass function, used to derive $C_2$ and $C_3$, cannot be measured as reliably as the first-order derivatives, used to derive $b_1$. We see the same trends when employing the ST and Warren mass function models as well as our proposed model, although in these cases the dependence on the fitting range is weaker (see Appendix C). We also find a similar behaviour of the bias predictions at $z = 0.5$.

The absolute deviations between the bias prediction from the Tinker mass function, fitted over the range M123, and other predictions are shown in Fig. 6. These other predictions are based on Tinker and Warren fits over different mass ranges and on fits for the same models compiled from the literature. We do not show relative deviations to avoid singularities at the zero crossings of $C_2$ and $C_3$. For the linear bias we find absolute deviations between the different predictions of $\Delta b_1 \simeq 0.2$, which roughly corresponds to relative deviations of around 10%. The relative deviations for $C_2$ and $C_3$ are around 50%, but can exceed 100%. Mass function fits over the high mass range M23 to the Tinker and Warren models deliver almost identical bias predictions, which can be expected since the fitted parameters are also very similar (see Fig. 0). In the high mass end these two bias predictions agree with the prediction from the fit to the Warren mass function given by Watson et al. (2013). Comparing our results to those of Crocce et al. (2010) we find a reasonable agreement for bias predictions based on the Warren model fitted over the whole mass range M0123.


3.1 Universal relation between bias parameters 

A universal behaviour of the mass function, as studied in Section 2.2, would suggest that the bias parameters derived from the mass function are universal as well when they are expressed as a function of peak height $\nu$. Our comparison with the literature shows that both the mass function from different simulations and the bias parameters derived from these mass functions (especially $C_2$ and $C_3$) can differ significantly from each other. These disagreements might arise not only from different cosmologies, but also from systematic effects, as discussed previously.

We now aim to verify the universality of the relation between the bias parameters. Such a universal behaviour would be useful for reducing uncertainties in linear bias measurements from third-order statistics (e.g. Manera and Gaztañaga 2011; Hoffmann et al. 2015). In Fig. 7 we show the PBS prediction of the second- and third-order bias parameters, $C_2$ and $C_3$, as a function of the prediction for the linear bias $b_1$. We find a 10% agreement for the $b_1$-$C_2$ and $b_1$-$C_3$ relations for large values of the linear bias ($b_1 > 1.5$). These relations appear to be well described by second- and third-order polynomials in the case of $C_2 = b_2/b_1$ and $C_3 = b_3/b_1$



Figure 5. Bias parameters $b_1$, $C_2 = b_2/b_1$ and $C_3 = b_3/b_1$ (top, central and bottom panels respectively), derived from mass function fits of the Tinker model via the PBS approach at $z = 0.0$. Grey lines show results based on mass function fits over the whole mass range M0123, blue and red lines show results from mass function fits which exclude the lowest and the two lowest mass samples (M123 and M23 respectively). Results based on fits to mass function measurements with 20, 30 and 40 bins are shown as dashed, dashed-dotted and dashed-double-dotted lines respectively. Results derived from fits of other mass function models performed in this work are shown in Fig. C1.

Figure 6. Deviations of PBS predictions for $b_1$, $C_2$ and $C_3$ at $z = 0.0$ (top, central and bottom panel respectively) derived from various mass function fits with respect to the bias from our Tinker mass function fit over the range M123 (shown as blue line in Fig. 5) as a function of the peak height $\nu = \delta_c^2/\sigma^2(m)$. Deviations between 10 and 30% are marked by grey areas. The line colour coding is the same as in Fig. 4. Vertical dashed lines denote the $\nu$ limits of the halo mass samples M0-M3.


respectively, with

$$b_N = \sum_{n=0}^{N} a_n b_1^n, \tag{10}$$

as we demonstrate in the same figure. This finding can be expected from expressing $b_2$ and $b_3$ as functions of $b_1$ with the PS model (Table 2). For this model the parameters $a_n$ can be directly predicted as $(a_0, a_1, a_2) = (0.51, -2.21, 1)$ for $b_2 = b_1 C_2$ and $(a_0, a_1, a_2, a_3) = (-1.49, 8.02, -6.64, 1)$ for $b_3 = b_1 C_3$. However, we find smaller rms values with respect to the Tinker and Warren predictions for the $b_1$-$C_2$ and $b_1$-$C_3$ relations when we leave $a_{n<N}$ as free parameters. We show values for $a_n$ from fits to the Tinker predictions in Fig. 7.
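The PS values of $a_n$ quoted above can be reproduced with a few lines of polynomial algebra. The sketch assumes $\delta_c = 1.686$ and the PS forms $\epsilon_1 = (\nu - 1)/\delta_c$, $\epsilon_2 = \nu(\nu - 3)/\delta_c^2$, $\epsilon_3 = \nu(\nu^2 - 6\nu + 3)/\delta_c^3$ with $E_N = 0$; neither is restated in this section, so both should be read as assumptions of the example. Inverting $b_1 = 1 + \epsilon_1$ gives $\nu$ as a linear function of $b_1$, which is then inserted into equations (8) and (9).

```python
import numpy as np
from numpy.polynomial import Polynomial

DELTA_C = 1.686                       # assumed spherical-collapse threshold
A2, A3 = -17.0 / 21.0, 341.0 / 567.0  # spherical collapse coefficients from the text

b1 = Polynomial([0.0, 1.0])           # the variable b_1
nu = 1.0 + DELTA_C * (b1 - 1.0)       # PS: b_1 = 1 + (nu - 1)/delta_c, solved for nu

eps1 = (nu - 1.0) / DELTA_C
eps2 = nu * (nu - 3.0) / DELTA_C**2
eps3 = nu * (nu**2 - 6.0 * nu + 3.0) / DELTA_C**3

b2 = 2 * (1 + A2) * eps1 + eps2                              # equation (8), E_N = 0
b3 = 6 * (A2 + A3) * eps1 + 3 * (1 + 2 * A2) * eps2 + eps3   # equation (9), E_N = 0

# Coefficients are ordered (a_0, a_1, ..., a_N)
print(np.round(b2.coef, 2))   # agrees with (0.51, -2.21, 1)
print(np.round(b3.coef, 2))   # agrees with (-1.49, 8.02, -6.64, 1)
```

The leading coefficient $a_N = 1$ follows directly from the highest power of $\nu$ in $\epsilon_N$, which is why only the $a_{n<N}$ are natural candidates for free parameters.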

For predictions based on our fits over the whole mass range, M0123, we find deviations from this universal behaviour. However, these results involve the low mass samples, which we found to be unreliable previously, possibly due to low mass resolution and noise in the halo detection. For lower $b_1$ values the different predictions differ more strongly from each other. Nevertheless, even a weakly universal relation, especially between $b_1$ and $C_2$, might already help to improve $b_1$ constraints from third-order statistics, as these two parameters are usually treated as independent. A comparison of the $b_1$-$b_2$ relation predicted by the PBS with measurements from combined second- and third-order clustering was presented by Saito et al. (2014), who also find that this relation is consistent with redshift independence. We will pursue the study of this relation with different measurements of $b_1$ and $C_2$ in a future analysis (Bel, Hoffmann & Gaztañaga in preparation).


4 BIAS PREDICTION VERSUS MEASUREMENTS FROM CLUSTERING

In the previous section we found that the PBS bias predictions depend on the employed mass function model and the mass range over which the models are fitted. We now aim to verify how the predictions for the linear bias, $b_1$, in these different cases compare to linear bias measurements from the two-point halo-matter cross-correlation. A comparison of second- and third-order bias parameter predictions with other measurements will be presented in a future analysis (Bel, Hoffmann & Gaztañaga in preparation).

The two-point cross-correlation between halo and matter density fields, $\xi^\times$, can be measured as the mean product of smoothed fluctuations $\delta(\mathbf{r}) = (\rho(\mathbf{r}) - \bar\rho)/\bar\rho$ of each density field, $\rho(\mathbf{r})$, at the positions $\mathbf{r}_1$ and $\mathbf{r}_2$ as a function of the scale $r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|$,

$$\xi^\times(r_{12}) = \langle \delta_h(\mathbf{r}_1)\,\delta_m(\mathbf{r}_2) \rangle. \tag{11}$$

The measurements for the four halo mass samples M0-M3 at the redshifts $z = 0.0$ and $z = 0.5$ are shown in the top panels of Fig. 8. The amplitude increases with halo mass, as expected from the PBS predictions. The growth of matter fluctuations further contributes to a change with redshift. At around $110\,h^{-1}$ Mpc, $\xi^\times$ shows a local maximum which results from baryonic acoustic oscillations in the initial power spectrum of the simulation.

Figure 7. Second- and third-order bias parameters, $C_2 = b_2/b_1$ and $C_3 = b_3/b_1$, as a function of the linear bias parameter $b_1$, predicted from the PBS model (top and bottom panel respectively). Results from this work are based on mass function fits over the mass range M123. Results from the literature are shown in the same colour coding as in Fig. 4. Note that the mass function fits from Crocce et al. (2010) and Watson et al. (2013) are based on the Warren model. Black solid lines show polynomials (equation (10)), which were fitted to the PBS predictions of the Tinker model, based on MICE-GC mass function fits from this work at $z = 0.0$ (magenta dashed line), with rms per degree of freedom of 0.02 and 0.12 for $C_2$ and $C_3$ respectively.

A relation between the two-point halo-matter cross-correlation and the two-point matter auto-correlation, $\xi_m(r_{12}) = \langle \delta_m(\mathbf{r}_1)\,\delta_m(\mathbf{r}_2) \rangle$, via the halo bias can be obtained by inserting the local bias model from equation (1) into equation (11),

$$\xi^\times(r_{12}) = b_1\,\xi_m(r_{12}) + \mathcal{O}[\xi_m^2]. \tag{12}$$

At large scales ($r_{12} > 20\,h^{-1}$ Mpc) we expect the higher-order contributions $\mathcal{O}[\xi_m^2]$ to be negligible, which allows for measurements of the linear bias as

$$b_\xi(r_{12}) \equiv \frac{\xi^\times(r_{12})}{\xi_m(r_{12})} \simeq b_1. \tag{13}$$

The measurements of $b_\xi$ are shown in the bottom panel of Fig. 8. We fit between 20 and 60 $h^{-1}$ Mpc, where scale-independence is a good approximation. Non-linear terms have an impact at smaller scales, but also around the scale of the baryonic acoustic oscillations. Comparing these bias measurements from the cross-correlation to those from the auto-correlation, shown in Hoffmann et al. (2015), we find that non-linearities have a stronger effect on the auto-correlation. However, differences in the bias from auto- and cross-correlations are small compared to differences between these measurements and the PBS predictions. We present a detailed analysis of the impact of non-linearities on bias from second-order statistics in Bel et al. (2015).
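The bias estimate of equation (13) and the constant fit over 20 to 60 $h^{-1}$ Mpc can be sketched as follows. This is a minimal example under assumptions that are not taken from the text: the bins are treated as uncorrelated (diagonal errors `sigma_b`), the function name is ours, and the correlation functions in the usage lines are synthetic power laws rather than MICE-GC measurements.

```python
import numpy as np

def fit_constant_bias(r, xi_cross, xi_matter, sigma_b, r_min=20.0, r_max=60.0):
    """Chi^2 fit of a scale-independent bias to xi_cross/xi_matter in [r_min, r_max].

    Returns the best-fit bias, its 1-sigma error and chi^2 per degree of freedom.
    Assumes uncorrelated bins with errors sigma_b on the ratio.
    """
    r = np.asarray(r, dtype=float)
    ratio = np.asarray(xi_cross, dtype=float) / np.asarray(xi_matter, dtype=float)
    sigma_b = np.asarray(sigma_b, dtype=float)
    sel = (r >= r_min) & (r <= r_max)
    w = 1.0 / sigma_b[sel] ** 2                # inverse-variance weights
    b1 = np.sum(w * ratio[sel]) / np.sum(w)    # minimises chi^2 for a constant model
    chi2 = np.sum(w * (ratio[sel] - b1) ** 2)
    dof = int(sel.sum()) - 1
    return b1, 1.0 / np.sqrt(np.sum(w)), chi2 / dof

# Synthetic usage: a matter correlation and a cross-correlation with b_1 = 1.5
r = np.linspace(5.0, 100.0, 20)
xi_m = (r / 5.0) ** -1.8
xi_x = 1.5 * xi_m
b1, b1_err, chi2_dof = fit_constant_bias(r, xi_x, xi_m, sigma_b=np.full(r.size, 0.05))
print(b1, b1_err, chi2_dof)
```

For a constant model the $\chi^2$ minimum is simply the inverse-variance weighted mean, which is why no numerical minimiser is needed here; with a full covariance between bins the weights would have to be replaced by the inverse covariance matrix.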

To compare the PBS predictions for the linear bias with the measurements from the two-point correlation we calculate the average bias prediction in each of the mass samples M0-M3, weighted with the halo number density $n(m)$,

$$b(M) = \frac{\int_{M_{\rm low}}^{M_{\rm up}} b(m)\,n(m)\,dm}{\int_{M_{\rm low}}^{M_{\rm up}} n(m)\,dm}. \tag{14}$$

$M_{\rm low}$ and $M_{\rm up}$ are the lower and upper limits of each mass sample M, given in Table 1. PBS $b_1$ predictions, based on fits to the Tinker model over the mass range M123, are compared with the $b_\xi(r_{12})$ measurements in the bottom panel of Fig. 8. For the high mass samples M2 and M3 we find clear deviations between measurements and predictions, as the latter are significantly too low on all scales.
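The number-weighted average of equation (14) can be evaluated numerically with a simple trapezoidal rule. The sketch below is only an illustration: the function names are ours, the tabulated grid is assumed to contain the sample limits, and the power-law inputs in the usage lines are toy functions chosen because the integrals are known analytically, not fits from this work.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integral of y(x) on a tabulated grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def sample_bias(m, bias, n_of_m, m_low, m_up):
    """Number-weighted mean bias of haloes with m_low <= m <= m_up (equation 14)."""
    m = np.asarray(m, dtype=float)
    bias = np.asarray(bias, dtype=float)
    n_of_m = np.asarray(n_of_m, dtype=float)
    sel = (m >= m_low) & (m <= m_up)      # grid points inside the mass sample
    numerator = trapezoid(bias[sel] * n_of_m[sel], m[sel])
    denominator = trapezoid(n_of_m[sel], m[sel])
    return numerator / denominator

# Toy usage in arbitrary mass units: n(m) ~ m^-2 and b(m) ~ m
m = np.linspace(1.0, 2.0, 2001)
print(sample_bias(m, bias=m, n_of_m=m ** -2, m_low=1.0, m_up=2.0))
```

Because the mass function falls steeply with mass, the weighted mean is dominated by haloes near the lower limit of each sample, so the result lies well below the bias at the mean of the two limits.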

The dependence of these deviations on the mass function model and the mass range in which the models are fitted is shown in Fig. 9. Fitting the mass function over the whole mass range, M0123, delivers $b_1$ predictions which tend to be 1-15% below the measurements, except for the low mass samples M0 and M1 at $z = 0.0$, for which we find up to 5% deviations in the opposite direction. We find the strongest variations between bias predictions from different models when i) the low mass range at $z = 0.5$ is included in the mass function fitting range, or ii) the bias is predicted for mass samples which are not within the fitting range (e.g. bias predictions for the mass sample M1, based on fits over the mass range M23). The first case i) might be explained by noise contaminating the FoF halo detection, which results in the poor mass function fits shown in Fig. 2 (see discussion in Section 2.2). In this latter figure we also see that the mass function fits outside the fitting range can strongly differ for different models. This could cause the strong differences in the bias predictions described above as case ii). We do not see that the deviations between PBS bias predictions and measurements decrease when the analysis is restricted to the higher mass range. This is true for both redshifts ($z = 0.0$ and $z = 0.5$) and consistent with results from Manera and Gaztañaga (2011).

Figure 8. Top: two-point correlation $\xi$ of the MICE-GC dark matter field (continuous lines) and the two-point halo-matter cross-correlation for the halo mass samples M0-M3 (blue circles, green crosses, orange squares and red triangles respectively) in the comoving outputs at redshift $z = 0.0$ (left) and $z = 0.5$ (right) as a function of scale $r_{12}$. Bottom: linear bias parameter $b_\xi$ derived from the two-point correlations via equation (13). Coloured lines are $\chi^2$-fits between 20 and 60 $h^{-1}$ Mpc. The minimum $\chi^2$ values per degree of freedom are 1.05, 1.96, 1.54, 0.23 for M0, M1, M2, M3 respectively at $z = 0.0$ and 0.79, 1.46, 1.23, 0.77 for M0, M1, M2, M3 respectively at $z = 0.5$. Grey lines show PBS bias predictions. The same figure for the auto-correlation is shown in Hoffmann et al. (2015).

However, restricting the fitting range to the higher mass range M23, we find a good agreement between the linear bias predictions from different mass function models at $z = 0.0$ and $z = 0.5$. The fact that the fitting performance strongly differs for the different models (see Fig. 0), while all models predict a linear bias with similar deviations from the measurements, suggests that the goodness of the mass function fit is not the only reason for these deviations, as mentioned in the introduction to this article. These results line up with reports of Manera et al. (2010), who also find the linear PBS bias prediction to lie below measurements from the power spectrum and two-point correlations, especially at high halo masses. As in our case, their result is independent of the employed mass function model and the way it is fitted to the measurements.

Furthermore, these authors investigate if differences between the predictions and measurements are related to the mass definition of haloes. They therefore perform their analysis using FoF groups with different linking lengths, as well as spherical overdensities to define halo masses. In both cases they find that the PBS model underpredicts the linear bias measurements. In fact, one could expect that the halo mass should be higher than those of FoF groups in order to match the PBS predictions (since shifting the measurements in Fig. 9 to higher masses would decrease the deviations between measurements and predictions). However, halo masses defined by spherical overdensities tend to be below those of FoF groups (Tinker et al. 2008). This should lead to higher measurements of the linear bias for spherical overdensities within a given mass range than corresponding measurements for FoF groups, as found by Tinker et al. (2010). The 10% underprediction of linear bias measurements by the PBS model, which we see for high mass haloes in Fig. 9, is therefore probably a lower bound. The consideration above also suggests that applying the Warren correction to the FoF masses could increase the differences between the PBS bias predictions and the measurements. Hence, these differences might not only be related to the halo mass definition, but also to assumptions of the PBS model, such as spherical collapse or a local bias relation (e.g. Schmidt et al. 2013; Paranjape et al. 2013). The conclusion that bias predictions are only weakly dependent on the employed mass function model does not hold for the higher-order bias predictions $C_2$ and $C_3$ (see Figs. 5, 6 and C1).

Figure 9. Top: linear bias parameters $b_1$ for the halo mass samples M0-M3 in the MICE-GC comoving outputs at redshift $z = 0.0$ (left) and $z = 0.5$ (right) versus the mean halo mass of each sample. Measurements from the two-point halo-matter cross-correlations, $\xi^\times$, via equation (13), are shown as black crosses with $1\sigma$ errors. PBS predictions, derived from MICE-GC halo mass function fits from this work, are shown as coloured symbols, which are slightly shifted to the left on the mass axis for clarity. The mass range over which the mass function is fitted is marked by a thick grey horizontal line. Error bars for the predictions are standard deviations derived from the seven different mass function binnings shown in Fig. 2. Bottom: relative difference between $b_\xi$ and the PBS predictions. The different panels show results for predictions based on mass function fits over the mass ranges M0123, M123, M23 and M012, from top to bottom respectively.

4.1 Bias ratios

The degeneracy between the growth of matter fluctuations and the bias of observed galaxy samples is one of the largest uncertainties for constraints of cosmological models derived from large-scale structure observations. With estimations of the typical host halo masses of such galaxy samples the PBS model can be employed to predict the bias of these samples to break the growth-bias degeneracy. Besides the estimation of the galaxies' host halo mass, the inaccuracy of the PBS bias prediction constitutes an additional source of error in this approach. Here we aim to quantify the impact of such inaccuracies on measurements of the linear growth factor. The considered growth measurements are based on the ratio of the correlation functions of galaxy samples at two different redshifts, $z_1$ and $z_2$, multiplied by the inverse ratio of the bias of these samples (see e.g. Hoffmann et al. 2015). The bias ratio needs to be estimated or predicted, while its uncertainties propagate linearly into the growth measurements.
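The linear propagation of a bias ratio error into the growth estimate can be made explicit with a toy calculation. The sketch assumes the large-scale relation $\xi_g(z) = b^2(z)\,D^2(z)\,\xi_{m,0}$, so that the growth ratio follows from the square root of the correlation amplitude ratio divided by the bias ratio; this specific form, the function name and all numbers are assumptions of the example and are not taken from the text.

```python
import math

def growth_ratio(xi_z1, xi_z2, bias_ratio):
    """D(z2)/D(z1) from correlation amplitudes and the bias ratio b(z2)/b(z1).

    Assumes xi(z) = b(z)^2 D(z)^2 xi_m0 at a fixed large scale.
    """
    return math.sqrt(xi_z2 / xi_z1) / bias_ratio

# Toy input: true growth ratio 0.77, bias 1.0 at z1 and 1.3 at z2
true_growth, b_z1, b_z2 = 0.77, 1.0, 1.3
xi_z1 = b_z1 ** 2                      # amplitudes in units of xi_m0 * D(z1)^2
xi_z2 = (b_z2 * true_growth) ** 2

exact = growth_ratio(xi_z1, xi_z2, b_z2 / b_z1)
biased = growth_ratio(xi_z1, xi_z2, 1.05 * b_z2 / b_z1)   # bias ratio 5% too high
print(exact, biased / exact - 1.0)
```

Since the estimate is inversely proportional to the bias ratio, a bias ratio that is wrong by a few per cent shifts the inferred growth ratio by nearly the same fraction in the opposite direction.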


In Fig. 10 we show the PBS bias ratio predictions for the redshifts $z_1 = 0.0$ and $z_2 = 0.5$ and all combinations of the four halo mass samples M0-M3 at these two redshifts. The predictions are based on fits of the Tinker model to the mass function over the mass range M123, which we previously found to be reliable at both redshifts. We find an overall variation of 5-10% for the higher mass range M123, while deviations are stronger when the low mass sample M0 at redshift $z = 0.0$ is taken into account. This variation is stronger than the uncertainties expected from the measurements. The strong deviations for the low mass range are expected due to the poor mass function fit including M0 at $z = 0.5$ (see Section 2.2). The error in the bias ratio will propagate into a 5-10% error of the growth factor measurement. This uncertainty is lower than the uncertainties found for growth measurements based on bias ratio estimations from the three-point correlation (see Hoffmann et al. 2015). However, the estimation of the galaxies' host halo mass will introduce an additional limitation in breaking the growth-bias degeneracy. Furthermore, the precision of any HOD fitting or mass interpretation from clustering measurements will be affected at a similar level.

Figure 10. Top: the ratio of halo bias at the redshifts $z = 0.0$ and $z = 0.5$, which could be used to measure the linear growth factor of matter fluctuations. The PBS predictions, shown as open circles, are based on fits of the Tinker model to mass function measurements over the mass range M123. Measurements from the two-point correlation are shown as crosses with $1\sigma$ errors, derived from propagating the error of $b_\xi$ at the two redshifts. Note that errors are smaller than the symbol size. Bottom: relative deviations between predictions and measurements.


5 SUMMARY AND CONCLUSION 

We investigated bias predictions from the PBS model, derived via fits to MICE-GC FoF mass functions. The accuracy of this model was tested by comparing its predictions for the linear bias to direct measurements from two-point correlations. In order to verify how the bias predictions are affected by the goodness of the mass function fit, we studied the performance of four mass function models, fitted over different mass ranges at the redshifts $z = 0.0$ and $z = 0.5$. These fits are based on a new mass function error and covariance estimator.

We show that the models of Press and Schechter (1974), Sheth and Tormen (1999) and Warren et al. (2006) are special cases of the mass function expression suggested by Tinker et al. (2010), as they correspond to certain values of free parameters in the Tinker model (see Table 2). This finding motivated us to propose a new model by fixing a different free parameter. The fitting performance of the mass function models, quantified by the minimum $\chi^2$ values per number of analysed mass function bins (d.o.f.), shows strong variations among different models and fitting ranges (see Fig. H). All models match the measurements better when the low mass range is excluded from the analysis. This indicates resolution effects, given that we analyse FoF groups with down to 20 particles. We find that the model of Tinker et al. (2010) shows the best overall performance, which can be expected since it contains the highest number of free parameters. Our proposed model delivers results similar to those from the Tinker model when the whole mass range is analysed, while it has one free parameter less. These two models outperform the model of Warren et al. (2006) for fits over the whole mass range. A restriction to the high mass range ($> 2.32 \times 10^{12}\,h^{-1}\,M_\odot$) leads to very similar fitting performance of the Warren et al. (2006) and Tinker et al. (2010) models with minimum $\chi^2$/d.o.f. values close to unity, while our proposal is slightly worse. Fits to the model of Sheth and Tormen (1999) show the most significant deviations from the measurements in all cases. These findings are independent of the mass function binning. We find that the inclusion of the covariance into the analysis substantially increases the minimum $\chi^2$ values of the best fits and also has an impact on the best fit parameters. However, our PBS bias predictions are only very weakly affected by the mass function covariance, especially when the higher mass range is analysed, where errors are shot-noise dominated.
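Why off-diagonal covariance elements can raise the minimum $\chi^2$ is easiest to see in a two-bin toy case. The sketch below compares $\chi^2$ computed with the full covariance matrix against the value obtained from the diagonal alone; the function names, the covariance and the residuals are invented for illustration and do not correspond to any measurement in this work.

```python
import numpy as np

def chi2_full(data, model, cov):
    """Chi^2 using the full covariance matrix: r^T C^-1 r."""
    r = np.asarray(data, dtype=float) - np.asarray(model, dtype=float)
    return float(r @ np.linalg.solve(np.asarray(cov, dtype=float), r))

def chi2_diag(data, model, cov):
    """Chi^2 using only the variances on the diagonal of the covariance."""
    r = np.asarray(data, dtype=float) - np.asarray(model, dtype=float)
    return float(np.sum(r ** 2 / np.diag(np.asarray(cov, dtype=float))))

# Two positively correlated bins whose residuals have opposite signs
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])
data, model = [1.0, -1.0], [0.0, 0.0]
print(chi2_diag(data, model, cov), chi2_full(data, model, cov))
```

With positively correlated bins, residuals of opposite sign are much less likely than the diagonal errors alone suggest, so the full $\chi^2$ is several times larger than the diagonal one in this example, while residuals of equal sign would be penalised less.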

The results described above can be affected by the way the mass function errors and the covariance between different mass function bins are estimated. We therefore conducted a detailed study of these quantities, which is presented in Appendix m. Given the single MICE-GC realisation, we rely on the internal JK error estimator, which we compared to theory predictions. The comparison reveals that the JK method is in good agreement with the predicted mass function error only in the shot-noise dominated high mass range ($> 5 \times 10^{14}\,M_\odot$), but overestimates the predictions by up to 80% in the lower mass range, where the errors are dominated
by sampling variance. We show that this difference arises be¬ 
cause the standard JK estimator assumes a wrong scaling 
relation between sampling variance and sample volume. By 
introducing an improved scaling relation, predicted from the 
linear matter power spectrum, we are able to propose a new 
mass function error estimator. Deviations between errors of 
our new estimator and the predictions are less than 10% (see 
Fig. EE]). The advantage of the new estimator with respect 
to predictions is that it does not rely on a model for halo bias 
and does not depend on the power spectrum normalisation. 
This approach to JK error estimations can also be applied 
to other statistics, such as two-point correlation functions 
(Hoffmann et al. in preparation). 
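For reference, the standard delete-one JK covariance that serves as the starting point can be sketched as follows. This is the textbook estimator only: the improved scaling relation proposed in this work is not reproduced here, because its form is not given in this section. The function name, the array layout and the Poisson toy counts are assumptions of the example.

```python
import numpy as np

def jackknife_mass_function_cov(counts, volumes):
    """Standard delete-one jackknife covariance of the halo number density.

    counts  : array (n_sub, n_bins), haloes per sub-volume and mass bin
    volumes : array (n_sub,), volume of each sub-volume
    """
    counts = np.asarray(counts, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    n_sub = counts.shape[0]
    # Number density in each mass bin with sub-volume i left out
    n_jk = (counts.sum(axis=0) - counts) / (volumes.sum() - volumes)[:, None]
    diff = n_jk - n_jk.mean(axis=0)
    return (n_sub - 1.0) / n_sub * (diff.T @ diff)

# Toy usage: 64 equal sub-volumes, three mass bins with Poisson counts
rng = np.random.default_rng(1)
counts = rng.poisson(lam=[400.0, 50.0, 5.0], size=(64, 3))
cov = jackknife_mass_function_cov(counts, np.ones(64))
print(np.sqrt(np.diag(cov)))
```

For equal sub-volumes the diagonal of this matrix reduces to the sample variance of the counts divided by the number of sub-volumes, which is the implicit scaling of the variance with volume that the standard estimator assumes.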

The presence of non-zero off-diagonal elements in the mass function covariance suggests that a similar covariance can be found in the luminosity function or the stellar mass function, as reported in the literature (e.g. Smith 2012; Benson 2014). The latter work demonstrated that neglecting the covariance in the stellar mass function significantly affects parameter constraints in semi-analytic models of galaxy formation. In this case a correct estimation of the error and covariance might be important for a correct interpretation of observations within such models.

Our FoF mass function measurements show no significant (< 5%) change between the redshifts $z = 0.0$ and $z = 0.5$ for haloes with more than 80 particles (corresponding to a lower halo mass limit of $2.32 \times 10^{12}\,h^{-1}\,M_\odot$, see Fig. 1). When including lower masses, the redshift dependence is stronger, possibly because of redshift dependent noise in the low mass FoF detection. In order to investigate a dependence on cosmology we compare our results with mass function fits from the literature. We find variations of about 10% in the low mass end and up to 40-60% in the high mass end (see Fig. 4). This finding is in agreement with other studies on departures from the mass function universality (see Section 2.3 for references). The advantage of the MICE-GC simulation is the large $\sim (3\,{\rm Gpc}/h)^3$ volume, which leads to smaller errors in the high mass range and therefore allows for a more significant assessment of the aforementioned variations in the mass function amplitude.

Our analysis demonstrated that numerical and systematic effects, such as halo mass definition, resolution effects or the fitting procedure, can contribute to variations in the mass function amplitude with a similar impact as differences in the cosmology (see Section 2.2). This result indicates that understanding such numerical and systematic effects is important for discriminating different cosmologies using the mass function. In fact, it was shown in the literature that uncertainties in the mass function of the order of magnitude that we find in our analysis can strongly affect constraints on the matter density, the dark-energy equation of state, the

surveys such as DES or Euclid (see e.g. Crocce et al. 2010; Wu et al. 2010; Costanzi Alunno Cerbolini et al. 2013; Weinberg et al. 2013; Appleby et al. 2013; Basse et al. 2014; Bocquet et al. 2015).


After comparing fits from different mass function models to MICE-GC measurements, we study the bias prediction derived from the best performing model of Tinker et al. (2010). Note that the non-linear bias parameter expressions for this model are presented in this work for the first time. We find that the bias prediction depends on the mass range over which the model was fitted, as the amplitude of the linear bias predictions varies by around 10% for different fitting ranges. For the second- and third-order bias parameters the amplitude can vary by more than 50%. These dependences of the bias predictions on the fitting range are comparable with variations obtained when employing fits to other mass function models. Furthermore, we find deviations with similar amplitudes in a comparison with bias predictions from mass function fits to other simulations, compiled from the literature (see Fig. 6).

A universal behaviour of the mass function would suggest that the bias parameters derived from the mass function are universal as well. Despite the strong variation among different bias predictions we find a tight universal relation between $b_1$ and $C_2$ or $C_3$ for $b_1 > 1.5$ across different simulations and mass function models. For smaller $b_1$ values, these relations are more dependent on the mass function fit, but still quite tight. Using the PS mass function model we derive that the second- and third-order bias parameters $b_2$ and $b_3$ can be expressed as second- and third-order polynomials of the linear bias $b_1$ (see Fig. 7). These findings suggest that the linear bias can, at least, constrain the non-linear bias parameters. This could be used to improve the linear bias measurements from third-order statistics.

A common application of the PBS model is to predict the linear bias from clustering. We measured the latter directly from the two-point halo-matter cross-correlation at large scales in the MICE-GC and compared it to the PBS predictions. The comparison was conducted using four different mass samples at the redshifts $z = 0.0$ and $z = 0.5$. Excluding the low mass sample M0 with less than 80 particles per halo from the analysis, which we expect to be affected by noise, we find that the linear bias predicted from the PBS model lies 5-10% below results from the two-point correlation (see Fig. 9). This effect is similar at the redshifts $z = 0.0$ and $z = 0.5$ and independent of the employed mass function model and the way it is fitted to the measurements, confirming previous findings (Manera et al. 2010; Manera and Gaztañaga 2011). Including the low mass sample delivers similar results, but with a larger scatter among the models. From the analysis in the higher mass ranges we conclude that shortcomings in the fitting performance of the mass function model are not the main reason for the discrepancy between PBS predictions for the linear bias and the corresponding measurements from clustering. An alternative reason for such discrepancies might be given by the overestimation of halo masses by the FoF algorithm, as those tend to be larger than halo masses of spherical overdensities (Tinker et al. 2008). However, from our results in Fig. 9 we conclude that shifting the linear bias measurements of FoF halo samples to lower masses would increase the deviations between measurements and PBS predictions. Hence, if FoF masses are overestimations of the halo masses described by the PBS model, then the differences between linear bias predictions and measurements found in this analysis constitute lower bounds for the inaccuracy of the predictions. This indicates that simple assumptions of the PBS model, such as a local bias model or spherical collapse, might limit the accuracy of the linear bias predictions. We will present a comparison between the non-linear bias parameters from predictions and measurements in Bel, Hoffmann & Gaztañaga (in preparation).

The 5-10% deviations between linear bias predictions
and measurements will affect, at a similar level, the precision
of any HOD fitting or mass interpretation from clustering
measurements. We demonstrate the impact of these deviations
on growth measurements from two-point correlations.
Such measurements are based on the ratio of the linear bias
at two different redshifts. Ignoring the unreliable low mass
range we find 5-10% deviations between PBS predictions
for the bias ratios and measurements from the two-point
correlation. This inaccuracy would propagate linearly into
measurements of the linear growth factor based on PBS bias
predictions.

APPENDIX A: MASS FUNCTION
MEASUREMENTS


The mass function measurements are based on a rewritten 
form of equation O 


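$\nu f(\nu) = \frac{\langle m \rangle}{\bar{\rho}} \, \frac{\mathrm{d}n(m,z)}{\mathrm{d}\lg m} \, \frac{\mathrm{d}\lg m}{\mathrm{d}\ln \nu}$,   (A1)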


where $\langle m \rangle$ is the mean halo mass in each logarithmic mass
bin. If the mass bins are chosen to be exactly equal in logarithmic
space, the mass function amplitude slightly oscillates
in the low mass range due to mass resolution effects. Since
the errors are smallest in the low mass range, this artefact
can significantly affect the fits, causing a strong dependence
of the fits on the number of mass function bins. Aiming
to minimise this mass discreteness effect, we determine
the minimum and maximum number of particles per halo
in each logarithmic mass bin ($N^{\min}$ and $N^{\max}$, respectively).
The width of the mass bin is then recalculated as
$m_p (N^{\max} - N^{\min} + 1)$, where $m_p$ is the particle mass. The
value of $\nu$ of each bin is calculated from the mean mass of
haloes in the bin. The term $\mathrm{d}\lg m / \mathrm{d}\ln \nu$ in equation (A1) is derived
directly from equation (4).
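
The bin-width recalculation described above can be illustrated with a minimal sketch. The function name, the toy halo catalogue and the bin edges are hypothetical and chosen only for illustration; they are not taken from the MICE-GC analysis.

```python
import numpy as np

def discrete_bin_widths(n_part, bin_edges, m_p):
    """Effective mass-bin widths from the particle counts present in each bin.

    n_part    : integer number of particles of every halo (assumed input)
    bin_edges : logarithmically spaced mass-bin edges, same units as m_p
    m_p       : particle mass
    """
    n_part = np.asarray(n_part)
    masses = n_part * m_p
    idx = np.digitize(masses, bin_edges) - 1       # bin index of every halo
    n_bins = len(bin_edges) - 1
    widths = np.full(n_bins, np.nan)               # NaN marks empty bins
    mean_mass = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = idx == b
        if not in_bin.any():
            continue
        n_min = n_part[in_bin].min()
        n_max = n_part[in_bin].max()
        widths[b] = m_p * (n_max - n_min + 1)      # m_p (N^max - N^min + 1)
        mean_mass[b] = masses[in_bin].mean()       # <m>, used to evaluate nu
    return widths, mean_mass

# Toy catalogue (hypothetical values): particle mass 1, two bins [20, 40) and [40, 80)
toy_n_part = np.array([20, 21, 22, 25, 30, 39, 40, 55, 79])
toy_edges = np.array([20.0, 40.0, 80.0])
toy_widths, toy_mean_mass = discrete_bin_widths(toy_n_part, toy_edges, m_p=1.0)
```

In this toy case the recalculated widths coincide with the nominal bin widths because every bin contains haloes at both its lowest and highest allowed particle number; for sparsely populated bins the two would differ.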


APPENDIX B: COVARIANCE 

In order to fit the mass function we estimate the errors and
the covariance between different mass bins. A direct measurement
of these quantities would require a large set of independent
realisations of the simulation. Since only one realisation
of the MICE-GC simulation was run, we estimate the
errors and the covariance using the Jack-Knife (hereafter
referred to as JK) sampling technique. To validate these
estimates we compare the results to theoretical predictions,
which we describe first.


B1 Covariance prediction 

Following Crocce et al. (2010) we derive the covariance prediction
for the comoving halo number density from the linear
bias relation at large scales,

$n(m, \mathbf{r}) = \bar{n}\,[1 + b_1(m)\,\delta_m(\mathbf{r})] + \delta n^{\rm sn}(m, \mathbf{r})$,   (B1)

where $n(m, \mathbf{r}) = N(m, \mathbf{r})/V_{\rm tot}$ is the number density of haloes
with mass $m$ in a volume (in our case the simulation
volume) around position $\mathbf{r}$, $\delta_m(\mathbf{r})$ is the matter density contrast
in the same volume and $b_1 = \delta_h/\delta_m$ is the linear halo
bias factor (as before, $m$ refers to the matter density field
when it appears as lower index and to the halo mass when
it is used as variable). The last term, $\delta n^{\rm sn}(m, \mathbf{r})$, corresponds
to noise. We will assume $\delta n^{\rm sn}$ to be Poisson shot-noise and
therefore independent of $\mathbf{r}$. The predictions for the unconditional
mass function can be related to those for the halo
number density via equation ©•. For the sake of simplicity
the following considerations are based on the latter. The covariance
matrix for number densities of haloes in the mass
bins $i$ and $j$ is defined as

$C_{ij} = \langle \Delta_i \Delta_j \rangle = \frac{1}{N_{\rm samp}} \sum_{k=1}^{N_{\rm samp}} \Delta_i^k \Delta_j^k$,   (B2)

where $\Delta_i^k = (n_i^k - \bar{n}_i)$ and $\langle\ldots\rangle$ denotes the average over
$N_{\rm samp}$ statistically independent volumes $k$ (note that the $\Delta$
introduced here is not related to the $\Delta$ used in equation (6)
for calculating the $\chi^2$ values of mass function fits). Inserting
the expression for the number density $n_i$ of haloes in mass
bin $i$ from equation (B1) leads to

$C_{ij} = \bar{n}_i \bar{n}_j b_i b_j \langle \delta_m^2 \rangle + \langle \delta n_i^{\rm sn} \delta n_j^{\rm sn} \rangle$.   (B3)

The variance of matter fluctuations $\langle \delta_m^2 \rangle = \sigma_m^2(m_{\rm tot})$ can
be derived from the power spectrum via equation ®>, where
$m_{\rm tot}$ is the total mass within the volume in which the mass
function is measured (in our case the total mass in the simulation).
Since this mass corresponds to a very large smoothing
radius we can compute $\sigma_m^2$ from the linear power spectrum.
The sampling variance contribution to the covariance
is therefore given by

$C^{\rm s}_{ij} = \bar{n}_i \bar{n}_j b_i b_j \sigma_m^2(m_{\rm tot})$.   (B4)

If the noise term $\delta n^{\rm sn}$ is Poissonian it averages out when
taking the mean over many independent volumes. The contribution
of shot-noise to the covariance is then given by

$C^{\rm sn}_{ij} = \delta_{ij} \frac{\bar{n}_i}{V_{\rm tot}}$,   (B5)

where $\delta_{ij}$ is the Kronecker delta. Based on these considerations
we can write the total covariance as

$C_{ij} = C^{\rm s}_{ij} + C^{\rm sn}_{ij}$.   (B6)

A more formal derivation of this relation is given
by Smith and Marian (2011), see also Robertson (2010);
Valageas et al. (2011); Smith (2012). The diagonal elements
of the covariance matrix correspond to the predictions for
the mass function variance,

$\sigma_i^2 = C_{ii}$,   (B7)

as given by Crocce et al. (2010). For fitting the mass
function we work with the normalised covariance $\hat{C}_{ij} =
C_{ij}/(\sigma_i \sigma_j)$ and differences normalised to $\sigma_i$ (note that here
$\sigma_i^2$ refers to the variance of the mass function in the mass
bin $i$ and not to the variance of the matter field, $\sigma_m^2$).


B2 Jack-Knife estimation of covariance


For mass function fits in observations the covariance prediction
is of limited use since it requires knowledge about
the bias and the power spectrum in advance. This problem
might be solved with an iterative approach for the fit, starting
from an initial guess for the power spectrum and the linear
bias factor. Another possibility to obtain the covariance
without knowledge of the bias and the power spectrum is to
estimate it with the JK sampling technique. Testing this approach
we construct $N_{\rm JK}$ JK samples by subtracting cubical
sub-volumes (hereafter referred to as JK cells) with the size
$V_{\rm tot}/N_{\rm JK}$ from the total simulation volume $V_{\rm tot}$. The basic
assumption of the JK approach is that the error scales with
the size of the subtracted volume (e.g. Norberg et al. 2009).
We follow the common approach by rescaling the covariance
with the factor $(N_{\rm JK} - 1)$, which leads to

$C^{\rm JK}_{ij} = (N_{\rm JK} - 1) \langle \Delta_i \Delta_j \rangle = \frac{N_{\rm JK} - 1}{N_{\rm JK}} \sum_{k=1}^{N_{\rm JK}} \Delta_i^k \Delta_j^k$.   (B8)

Again $\Delta_i^k = (n_i^k - \bar{n}_i)$, but now $\langle\ldots\rangle$ is the average over the
different JK samples $k$, $n_i^k$ is the comoving number density
of haloes in the mass bin $i$ in each JK sample and $\bar{n}_i$ is the
corresponding halo number density in the whole simulation
volume. Note that the rescaling factor, $(N_{\rm JK} - 1)$, is only
weakly justified and can be improved, as we show in Section
B4. As in the case of the predictions, the diagonal elements of
$C^{\rm JK}_{ij}$ are the JK estimation for the variance ($\sigma_i^2 = C_{ii}$) and
we normalise $\hat{C}^{\rm JK}_{ij} = C^{\rm JK}_{ij}/(\sigma_i^{\rm JK} \sigma_j^{\rm JK})$. Note that we can
use the JK approach for studying the covariance between
low and high mass bins because of the large mass range of
the MICE-GC simulation. This analysis would not be possible
using nested boxes, where the different mass ranges are
covered by different realisations with different box sizes (e.g.
Warren et al. 2006; Crocce et al. 2010; Tinker et al. 2010).
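
A minimal sketch of the standard JK estimator with the $(N_{\rm JK} - 1)$ rescaling is given below. It assumes that halo counts per mass bin are already available for every JK cell; the function names and the toy input are hypothetical and serve only as an illustration.

```python
import numpy as np

def jk_number_densities(counts_cells, v_tot):
    """Number densities of the JK samples and of the full volume.

    counts_cells : array (N_JK, N_bins), halo counts per mass bin in each JK cell
    v_tot        : total volume; every cell is assumed to have volume v_tot / N_JK
    """
    counts_cells = np.asarray(counts_cells, dtype=float)
    n_cells = counts_cells.shape[0]
    v_cell = v_tot / n_cells
    total = counts_cells.sum(axis=0)
    n_tot = total / v_tot                              # density in the whole volume
    n_jk = (total - counts_cells) / (v_tot - v_cell)   # cell k removed from sample k
    return n_jk, n_tot

def jackknife_covariance(n_jk, n_tot):
    """Standard JK covariance, (N_JK - 1) <Delta_i Delta_j>."""
    n_jk = np.asarray(n_jk, dtype=float)
    n_samples = n_jk.shape[0]
    delta = n_jk - np.asarray(n_tot, dtype=float)      # Delta_i^k = n_i^k - nbar_i
    mean_dd = delta.T @ delta / n_samples              # average over JK samples
    return (n_samples - 1) * mean_dd

def normalise(cov):
    """Normalised covariance C_ij / (sigma_i sigma_j)."""
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)

# Toy input (hypothetical): two JK cells, one mass bin
toy_n_jk, toy_n_tot = jk_number_densities([[3.0], [5.0]], v_tot=2.0)
toy_cov = jackknife_covariance(toy_n_jk, toy_n_tot)
```

For the toy input the two JK samples have densities 5 and 3 around a mean of 4, so the estimator returns a variance of 1.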


B3 Covariance prediction versus Jack-Knife 
estimation 

A comparison between the error prediction from equation
(B6) and the corresponding JK estimation from equation
(B8) (with $\sigma_i^2 = C_{ii}$) is shown in Fig. B1 for the
redshift $z = 0.0$. The error predictions are based on linear
bias predictions from mass function fits to the Tinker model
over the whole mass range, for which we expect uncertainties
of around 10% (see Section n). From the prediction we
expect the error to be dominated by sampling variance in
the low mass end and by shot-noise in the high mass end. At
halo masses of $M_h \simeq 2 \times 10^{13}\,M_\odot$ both sources are predicted
to contribute equally to the total error. The JK error estimation
is in good agreement with the predictions in the high
mass end ($M_h > 5 \times 10^{14}\,M_\odot$). This indicates that the JK
method is working well for different JK cell volumes when
the error is dominated by shot-noise. Furthermore, the shot-noise
is well described by a Poisson distribution. At halo
masses lower than $5 \times 10^{14}\,M_\odot$ we find the JK error to be up
to 80% higher than the prediction.
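
The interplay of the two predicted error contributions can be sketched as follows, assuming the covariance prediction of equations (B4) to (B6). All numerical inputs (number densities, bias values, matter variance and volume) are hypothetical round numbers, not MICE-GC measurements.

```python
import numpy as np

def predicted_covariance(nbar, bias, sigma2_m_tot, v_tot):
    """Predicted covariance: sampling variance plus Poisson shot-noise.

    nbar         : mean halo number density per mass bin
    bias         : linear bias b_1 per mass bin
    sigma2_m_tot : matter variance at the scale of the total volume
    v_tot        : total volume
    """
    nbar = np.asarray(nbar, dtype=float)
    bias = np.asarray(bias, dtype=float)
    c_s = np.outer(nbar * bias, nbar * bias) * sigma2_m_tot  # sampling variance (B4)
    c_sn = np.diag(nbar / v_tot)                             # Poisson shot-noise (B5)
    return c_s + c_sn, c_s, c_sn                             # total covariance (B6)

# Hypothetical inputs: an abundant low-mass bin and a rare high-mass bin
ex_nbar = np.array([1e-3, 1e-7])
ex_bias = np.array([1.0, 3.0])
ex_total, ex_s, ex_sn = predicted_covariance(ex_nbar, ex_bias,
                                             sigma2_m_tot=1e-6, v_tot=3000.0**3)

# Relative errors sigma_i / nbar_i from the diagonal elements
ex_rel_err = np.sqrt(np.diag(ex_total)) / ex_nbar
```

With these illustrative numbers the sampling variance dominates the abundant bin and the shot-noise dominates the rare bin, which is the qualitative behaviour described in the text.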

This overestimation is consistent with results reported
by Crocce et al. (2010) using the same simulation box size as
the MICE-GC, while for smaller simulation boxes they find
the JK error to be lower than the prediction. The fact that
the overestimation of the JK error in the low mass end is
larger for smaller JK cells indicates that the JK assumption
of a linear relation between errors and volume is inadequate
when sampling variance is the dominating source of error.
However, increasing the size of the JK cells results in a smaller
number of samples and therefore a stronger noise on the
estimated error. In Fig. B1 we also show a new JK error
estimation, which is in good agreement with the predictions
at all masses. This new JK error is based on an improved
scaling between the sampling variance in a JK cell and in
the whole simulation box using the linear power spectrum,
as explained in Section B4.

In Fig. B2 we compare the normalised covariance of the
mass function between the mass bins $i$ and $j$, predicted via
equation (B6), with the JK estimation from equation (B8)
using $8^3$ JK samples at $z = 0.0$. The shape of the covariance
is in good agreement with results from Smith and Marian
(2011). The low mass bins are highly correlated because
of sampling variance, while high mass bins are uncorrelated
as their errors are dominated by shot-noise. For the
comparison of the variances we find a reasonable agreement
between the prediction and the JK estimations, especially
in the high mass end. In the low mass end the covariance
seems to be overestimated by the JK approach, while the
new JK method reproduces the prediction well.

Figure B1. Top: Relative errors in the mass function. Lines
show the predictions, including the sampling variance (dashed
line) and the Poisson shot-noise contribution (dash-dotted)
together with the resulting total error (continuous line), derived
from equations (B4)-(B6) (with $\sigma_i^2 = C_{ii}$). Open circles
show the standard JK error estimation (equation (B8)) and open
triangles show the errors from a new JK estimation (equation
(B13)). Both estimations are based on $8^3$ JK cells. Bottom: relative
deviations between the JK and the total predicted error.
The symbol types correspond to those in the top panel. Results
derived from $4^3$ cubical JK cells are shown as grey solid symbols.

We show a more detailed comparison of the covariance
amplitudes in Fig. B3, fixing one mass bin $i$ and varying the
second mass bin $j$. For $8^3$ JK cells we find the normalised JK
covariance amplitudes to be higher than the predictions, with
differences of up to 0.3. Using larger JK cells this overestimation
slightly decreases, while results become more noisy.
Again the improved estimation is in better agreement with
the prediction. We have verified that our conclusions also
hold for redshift $z = 0.5$.

B4 Improved JK estimator 

We now aim to understand the disagreement between the
predicted mass function error and the corresponding JK
estimation in order to improve the latter. The $N_{\rm JK}$ JK
samples are constructed by subtracting haloes in JK cells
of the size $V_{\rm tot}/N_{\rm JK}$ from the total halo distribution. The
number of haloes in the remaining JK sample is then given
by $N_h^{\rm JK} = N_h^{\rm tot} - N_h^{\rm JKcell}$, where $N_h$ denotes halo counts
(not the number of JK samples, $N_{\rm JK}$). The volume of a JK sample
is given by $V_{\rm JK} = V_{\rm tot} - V_{\rm JKcell}$. From the definition of
the number density, $n = N_h/V$, and the deviation from
the mean over the total volume, $\Delta = n - \bar{n}$, one can derive
$\Delta_{\rm JK} = -(N_{\rm JK} - 1)^{-1} \Delta(V_{\rm JKcell})$. Note that this relation
also holds for the number density contrast, $\delta_{\rm JK} =
-(N_{\rm JK} - 1)^{-1} \delta(V_{\rm JKcell})$. Hence, subtracting an overdense
JK cell from the total volume generates a slightly underdense
JK sample. This result leads to a relation between
the variances of the number density, $\sigma^2 = \langle \Delta^2 \rangle$, in the JK
cells, and the corresponding variance for the JK samples,

$\sigma^2(V_{\rm JKcell}) = (N_{\rm JK} - 1)^2 \langle \Delta_{\rm JK}^2 \rangle$.   (B9)

Figure B2. Normalised covariance between mass function bins
at redshift $z = 0.0$, derived from predictions (top, equations
(B4)-(B6)), the standard JK estimator (centre, equation (B8)) and
the new JK estimator (bottom, equation (B13)). The estimations
are based on $8^3$ JK cells.

Figure B3. Deviations between the standard and the new JK
covariance estimation (solid and dashed lines respectively) and
the prediction, fixing one bin to $i = 10$ and varying the other. We
find similar results for different values of $i$. Black and grey lines
show results based on $8^3$ and $4^3$ JK cells respectively.


As for equation (B2), $\langle\ldots\rangle$ denotes the ensemble average. The
variance of the JK samples is therefore simply related to the
variance at the scales of the JK cells. Note that $\langle \Delta_{\rm JK}^2 \rangle$ is not
the variance at the scale of the JK sample volume, $\sigma^2(V_{\rm JK})$,
since the JK samples are not independent from each other.
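
The relation between the deviations in the JK samples and in the JK cells, and the resulting variance relation of equation (B9), can be checked numerically with a short sketch. The Poisson-distributed toy counts and the choice of $4^3$ cells are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells = 4**3                        # number of JK cells (toy choice)
v_tot = 1.0                           # total volume in arbitrary units
v_cell = v_tot / n_cells
counts = rng.poisson(50.0, size=n_cells).astype(float)   # toy halo counts per cell

n_bar = counts.sum() / v_tot                      # mean density of the full volume
delta_cell = counts / v_cell - n_bar              # deviation of each JK cell
delta_jk = (counts.sum() - counts) / (v_tot - v_cell) - n_bar   # deviation of each JK sample

# Delta_JK = -(N_JK - 1)^-1 Delta(V_JKcell)
relation_holds = np.allclose(delta_jk, -delta_cell / (n_cells - 1))

# Equation (B9): variance in the cells versus variance of the JK samples
var_cell = np.mean(delta_cell**2)
var_jk = np.mean(delta_jk**2)
b9_holds = np.isclose(var_cell, (n_cells - 1)**2 * var_jk)
```

Both checks hold exactly (up to rounding) for any set of counts, since the relation follows from the definitions of the JK samples alone and does not depend on the statistics of the counts.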

From the linear bias model we assume that the variance
of the halo number density results from shot-noise ($\sigma_{\rm sn}^2$) and
sampling variance ($\sigma_{\rm s}^2$), as explained in Section B2. The latter
contribution to the total variance of the JK cells, measured
via equation (B9), is therefore given by

$\sigma_{\rm s}^2(V_{\rm JKcell}) = (N_{\rm JK} - 1)^2 \langle \Delta_{\rm JK}^2 \rangle - \sigma_{\rm sn}^2(V_{\rm JKcell})$.   (B10)

Since $\bar{n}/V_{\rm JKcell} = N_{\rm JK} \bar{n}/V_{\rm tot}$, the shot-noise for JK cells is
related to the shot-noise of the whole box as $\sigma_{\rm sn}^2(V_{\rm JKcell}) =
N_{\rm JK} \sigma_{\rm sn}^2(V_{\rm tot})$. To obtain the sampling variance at the scale
of the simulation box, $\sigma_{\rm s}^2(V_{\rm tot})$, we multiply $\sigma_{\rm s}^2(V_{\rm JKcell})$ with
a rescaling factor

$r_\sigma = \sigma_{\rm s}^2(V_{\rm tot})/\sigma_{\rm s}^2(V_{\rm JKcell})$,   (B11)

which can be predicted from the linear matter power spectrum.
This prediction is based on the assumption that, at
large smoothing scales, the sampling variance of the halo
number density is related to the dark matter variance by
the linear bias factor, $\sigma_{\rm s}^2 = b_1^2 \sigma_m^2$. Since $b_1$ is constant at
large scales (see Fig.[U), it cancels out in the rescaling factor,
hence $r_\sigma^{h} = r_\sigma^{m} = r_\sigma$. The prediction is then based on $\sigma_m(V)$,
computed from the linear matter power spectrum via equation
©•. We can now write the expression for the sampling
variance of the simulation box, based on equation (B10), in
the general case of the covariance

$C^{\rm s}_{ij}(V_{\rm tot}) = r_\sigma [(N_{\rm JK} - 1)^2 \langle \Delta_i \Delta_j \rangle - N_{\rm JK} C^{\rm sn}_{ij}(V_{\rm tot})]$.   (B12)

Note that we have now dropped the index JK in $\Delta$ for
simplicity and to be consistent with equation (B8) for the
standard JK estimator. As in the latter equation the lower
indices refer to the mass bins $i$ and $j$. With the Poisson shot-noise
term from equation (B5), the total covariance is then
given by equation (B6) as $C_{ij} = C^{\rm s}_{ij} + C^{\rm sn}_{ij}$. The resulting
expression constitutes a new error estimation for the mass
function which combines direct measurements of sampling
variance via the JK sampling with predictions for the rescaling
factor and the shot-noise. This new estimator can be
written more explicitly as

$C^{\rm newJK}_{ij} = r_\sigma (N_{\rm JK} - 1)^2 \langle \Delta_i \Delta_j \rangle + \delta_{ij} \frac{\bar{n}_i}{V_{\rm tot}} (1 - r_\sigma N_{\rm JK})$.   (B13)

As before the diagonal elements correspond to the variance,
$\sigma_i^2 = C_{ii}$. Note that for Poisson shot-noise dominated
errors the sampling variance can be approximated as
$C^{\rm s}_{ij}(V_{\rm tot}) \simeq 0$ and the new estimator reduces to the shot-noise
term, $C^{\rm newJK}_{ij} \simeq \delta_{ij} \bar{n}_i/V_{\rm tot}$. In this case we derive from
equation (B12) that $(N_{\rm JK} - 1)^2 \langle \Delta_i \Delta_j \rangle \simeq N_{\rm JK} C^{\rm sn}_{ij}(V_{\rm tot})$.
For large numbers of JK samples ($N_{\rm JK} \simeq N_{\rm JK} - 1$) this
expression is equivalent to $(N_{\rm JK} - 1) \langle \Delta_i \Delta_j \rangle \simeq \delta_{ij} \bar{n}_i/V_{\rm tot}$.
The left hand side of this relation is the standard JK estimator.
This consideration explains the good agreement
of the standard JK estimator with the improved JK estimator
and the predictions at high masses, where the errors
and the covariance are shot-noise dominated (Figs B1,
B2 and B3). In the low mass range our new method is in
much better agreement with the predictions than the standard
JK error estimator (B8). This can be understood with
the following consideration. For large numbers of JK samples
($N_{\rm JK} \simeq N_{\rm JK} - 1$) the new estimator corresponds to the
standard JK estimator if $r_\sigma = \sigma_m^2(V_{\rm tot})/\sigma_m^2(V_{\rm JKcell}) =
1/N_{\rm JK}$. Since $V_{\rm tot} = N_{\rm JK} V_{\rm JKcell}$, this condition is equivalent
to $V_{\rm tot} \sigma_m^2(V_{\rm tot}) = V_{\rm JKcell} \sigma_m^2(V_{\rm JKcell})$. The JK approach
can therefore be described as the assumption that
$\sigma_m(V) \sim V^{-1/2}$ for large $N_{\rm JK}$. We show $\sigma_m(V)$, computed
from the linear power spectrum via equation a, in Fig. B4.
The JK assumption is in clear disagreement with the prediction,
which leads to a too high $r_\sigma$ and therefore an overestimation
of sampling variances at the scale of the simulation
box for the standard JK assumption.
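
A minimal sketch of the new estimator of equation (B13) is given below. It takes the rescaling factor $r_\sigma$ as an input; computing $r_\sigma$ from a linear power spectrum is not shown. The function name and all toy numbers are hypothetical.

```python
import numpy as np

def new_jk_covariance(n_jk, n_tot, v_tot, r_sigma):
    """New JK covariance following equation (B13).

    n_jk    : array (N_JK, N_bins), number density per mass bin in each JK sample
    n_tot   : array (N_bins,), number density per mass bin in the full volume
    v_tot   : total volume
    r_sigma : rescaling factor sigma_s^2(V_tot) / sigma_s^2(V_JKcell), assumed known
    """
    n_jk = np.asarray(n_jk, dtype=float)
    n_tot = np.asarray(n_tot, dtype=float)
    n_samples = n_jk.shape[0]
    delta = n_jk - n_tot                            # Delta_i^k = n_i^k - nbar_i
    mean_dd = delta.T @ delta / n_samples           # <Delta_i Delta_j>
    shot = np.diag(n_tot / v_tot)                   # Poisson shot-noise term (B5)
    return (r_sigma * (n_samples - 1)**2 * mean_dd
            + shot * (1.0 - r_sigma * n_samples))

# Toy JK samples (hypothetical values): 64 samples, 3 mass bins
rng = np.random.default_rng(1)
ex_n_samples = 64
ex_n_tot = np.array([1e-3, 1e-4, 1e-5])
ex_n_jk = ex_n_tot * (1.0 + 0.01 * rng.standard_normal((ex_n_samples, 3)))
ex_v_tot = 1e9

# Example call with an assumed rescaling factor below the JK value 1 / N_JK
ex_cov_new = new_jk_covariance(ex_n_jk, ex_n_tot, ex_v_tot, r_sigma=0.5 / ex_n_samples)

# Limit r_sigma = 1 / N_JK: the shot-noise correction vanishes
ex_cov_limit = new_jk_covariance(ex_n_jk, ex_n_tot, ex_v_tot, r_sigma=1.0 / ex_n_samples)
```

For $r_\sigma = 1/N_{\rm JK}$ the result equals the standard JK estimate multiplied by $(N_{\rm JK} - 1)/N_{\rm JK}$, which illustrates the large-$N_{\rm JK}$ correspondence discussed in the text.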

The advantage of the new JK estimation with respect
to the prediction is that it does not require knowledge of the
halo bias. Furthermore, this new approach is independent of
the normalisation of the power spectrum as it cancels out
in the rescaling factor (equation (B11)). However, the large
scale power spectrum still needs to be known for accurate
rescaling of the sampling variance via $\sigma_m(V)$. For simulations
the linear power spectrum is given. In this case the
new method can be used instead of running several realisations
for deriving mass function errors and covariances. In
observations the large scale power spectrum can only be assumed.
However, with such an assumption the accuracy of
the error estimation can still be improved with respect to
the standard JK method, which also implies the strong assumption
of $\sigma(V) \sim V^{-1/2}$. The advantage of the new JK
estimation with respect to using independent subvolumes
for the error estimation is that the JK samples cover larger
volumes with larger average numbers of massive haloes.
The covariance between the low- and high mass end of the
mass function is therefore better sampled by JK samples
than subvolumes. We employ our new method for the error
and covariance estimation using $N_{\rm JK} = 8^3$ samples.

Figure B4. The standard deviation of matter fluctuations, $\sigma_m$,
predicted from the linear MICE-GC power spectrum via equation 0
as a function of the volume, $V$, of the spherical top
hat smoothing window (solid line). The dash-dotted line shows
the $\sigma_m(V)$ relation which corresponds to the standard JK estimator
(with an arbitrary normalisation, chosen to coincide
with the predictions at the MICE-GC simulation volume $V$). The
volumes of the simulation box and the JK cells are shown as vertical
dashed lines.


APPENDIX C: PBS BIAS PREDICTIONS 

We show in Fig. C1 the PBS bias predictions based on the
mass function models studied in this analysis. The different
predictions are based on fits over the four mass ranges
M0123, M123, M23 and M012, defined in Tabled. The figure
is analogous to Fig. 5. We find that the linear bias parameter
$b_1$ is less sensitive to the mass function model and the
mass function fitting range than the non-linear bias parameters
$c_2 = b_2/b_1$ and $c_3 = b_3/b_1$. In addition to Fig. 5 we
show that the bias predictions become unstable when the low
mass sample, Ml, is included in the analysis. Furthermore,
we show bias predictions, based on mass function fits over
the range M123, which were derived neglecting the covariance
between different mass function bins. We find that
for this example the mass function covariance has a smaller
impact on the bias prediction than the choice of the mass
function model or the mass function fitting range. We expect
the impact of the mass function covariance to increase
when low mass samples are included in the fit. However, the
low mass range is hard to access for analysis of halo abundance
in the MICE-GC, due to resolution effects (see Section

Figure C1. Same as Fig. 5 but for all mass function models analysed in this work. In addition we show bias predictions based on mass
function fits over the whole MICE-GC mass range (M0123). For predictions from fits over the mass range M123 we also show, as light blue lines,
results derived without taking the covariance in the measurements between different mass function bins into account.
Note that these lines are covered by other results in most cases.